How To Reduce Memory Consumption While Exporting Data?


In one of the projects, a web application to be more precise, I worked on exporting lead and enquiry data using the traditional approach: process the entire dataset at once and render it in the desired file format, such as .csv or .xls. However, when it came to exporting large amounts of data, two issues arose: memory exhaustion and exceeding the maximum execution time.

The need, therefore, was to make the whole system more scalable, so that large datasets could be exported within a minimal response time and with better memory usage.

 

What We Did To Improve Memory Usage

While exporting a large dataset, we tried a different approach: using a generator function. Unlike a normal function, which returns a single value, a generator function yields values that can be iterated over one at a time.

With generators, you can write code that lazily iterates over data without having to build an array first. Building an array consumes memory in proportion to the size of the data and increases processing time; a generator, on the other hand, holds only the current item in memory. It therefore preserves memory when operating on huge sets of data, as the sketch below illustrates.
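For instance, here is a minimal sketch contrasting the two styles (it uses the same User model that appears later in this post; the function names are illustrative):

// Eager: loads every row into an array at once, so memory grows with the table.
function usersAsArray(): array
{
    return User::all()->toArray();
}

// Lazy: yields one row at a time, so memory stays roughly constant.
function usersAsGenerator(): \Generator
{
    foreach (User::cursor() as $user) {
        yield $user;
    }
}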

Integrating this can still be a long-running process, though. The way around that is to implement a queue (or any other asynchronous technique), increase the max execution time, and run the task from the CLI. This keeps the heavy work out of the web request, avoids further memory consumption, and lets us perform any sort of operation, such as export or import. A minimal sketch of a queued export job follows.
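The sketch below assumes a standard Laravel queued job; the class name ExportUsersJob is illustrative:

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;

// Runs on a queue worker, outside the HTTP request cycle, so the web
// request returns immediately and the export can take as long as it needs.
class ExportUsersJob implements ShouldQueue
{
    use Dispatchable, Queueable;

    public function handle(): void
    {
        // Stream and write the export here, e.g. using the generator below.
    }
}

// Dispatch from a controller or an Artisan command:
ExportUsersJob::dispatch();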

Assuming you have a User model with 10M+ rows in the database and you want to iterate over it in your code in order to export data, you should use a generator (as depicted in the code below) instead of simply calling User::all():

 

// A generator that streams users from the database one row at a time.
// cursor() keeps only the current model in memory instead of loading every row.
// (yield is only valid inside a function, so the export is wrapped in one.)
function exportUsers(): \Generator
{
    Log::info("Memory usage before query execution: " . memory_get_usage());
    Log::info("Initial time before query execution: " . round(microtime(true) * 1000));

    foreach (User::cursor() as $user) {
        yield [
            'Name'            => $user->full_name,
            'Email'           => $user->email,
            'Mobile Number'   => $user->mobile_number,
            'DOB'             => $user->dob,
            'Gender'          => $user->gender,
            'Address1'        => $user->address1,
            'Address2'        => $user->address2,
            'State'           => $user->state_name,
            'District'        => $user->district_name,
            'PAN Number'      => $user->pan_number,
            'Bank Name'       => $user->bank_name,
            'Bank Branch'     => $user->bank_branch,
            'Account Number'  => $user->account_number,
            'IFSC Code'       => $user->ifsc_code,
            'Company Name'    => $user->company_name,
            'Branch Location' => $user->branch_location,
            'Approval Status' => $user->approval_status,
        ];
    }

    Log::info("Memory usage after query execution: " . memory_get_usage());
    Log::info("Total time after query execution: " . round(microtime(true) * 1000));
}

 

You can also analyse memory usage during task execution by calling memory_get_usage() and memory_get_peak_usage() before and after query execution, and measure the time the task consumes by using PHP's time functions to compare timestamps before and after execution.
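A simple measurement wrapper might look like this (runExportTask() is a hypothetical stand-in for the task being measured):

$memoryBefore = memory_get_usage();
$timeBefore   = microtime(true);

runExportTask(); // hypothetical: the export task being measured

Log::info('Memory delta: ' . (memory_get_usage() - $memoryBefore) . ' bytes');
Log::info('Peak memory: ' . memory_get_peak_usage() . ' bytes');
Log::info('Elapsed: ' . round((microtime(true) - $timeBefore) * 1000) . ' ms');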

 

What Was The Impact?

We noticed a number of improvements and went on to apply the approach in other modules/projects as well. Below was the impact of using the generator function:

  • Increased performance
  • Reduced memory utilization
  • Reduced resource utilization
  • Fewer memory-exhausted errors while exporting
  • Fewer maximum-execution-time-exceeded errors
  • Memory freed up after task completion

Some Best Practices To Keep In Mind When You Go About It

  • Use generators to export large datasets.
  • Avoid fetching large result sets as an array; fetch one row at a time, process it, and then move on to the next.
  • Avoid very complex scripts, especially ones with nested loops.
  • Avoid unnecessary variable declarations and remove unused variables.
  • Unset variables/arrays after use so they can be garbage-collected (see the sketch after this list).
  • Analyse memory usage during task execution with memory_get_usage() and memory_get_peak_usage() before and after query execution.
  • Measure the time consumed by the task with PHP's time functions, comparing response time before and after task execution.
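A minimal sketch of the unset/garbage-collection point (the two helper functions are hypothetical):

$rows = fetchChunkOfRows();   // hypothetical: build one chunk of export rows
writeChunkToFile($rows);      // hypothetical: write the chunk out

unset($rows);          // drop the reference so the memory can be reclaimed
gc_collect_cycles();   // optionally force collection of circular references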

 Tools & Libraries Used: https://github.com/rap2hpoutre/fast-excel
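fast-excel accepts a generator directly, so rows are streamed into the spreadsheet instead of being materialised in memory first. A sketch based on the library's documented usage:

use Rap2hpoutre\FastExcel\FastExcel;

// Yields users one at a time straight from the database cursor.
function usersGenerator(): \Generator
{
    foreach (User::cursor() as $user) {
        yield $user;
    }
}

// Streams the rows to an .xlsx file without building the full dataset in memory.
(new FastExcel(usersGenerator()))->export('users.xlsx');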

 

If you have more ideas or suggestions on how to reduce memory consumption while exporting large sets of data, do share your feedback below.

 

 

