There is an old chestnut of wisdom that Perl programmers shouldn't care about the memory and CPU efficiency of their programs, or, to put it differently: "if you care about memory and CPU efficiency, use C and write your own garbage collector".
I often wondered about the wisdom of that advice, so one day I did the opposite and started using Perl for memory- and CPU-intensive tasks. I don't want to write more C than I can help; I want to use Perl, because I like programming in Perl. So why should I care about speed and memory usage until I have to?
Certain use cases aside, such as embedded programming where memory and CPU power are extremely limited, on mainstream desktop and server hardware it is a good first approximation to start coding in Perl and improve efficiency as needed. If Perl is your kind of language, resist the temptation to decide up front that Perl can't possibly cut it in high-performance environments.
Some general guidelines apply, of course: avoiding premature optimization and having a good design help you write better programs in any language.
As for memory usage, in my use case the initial memory footprint matters much less than keeping usage steady and nearly constant, since I'm dealing with long-running processes that do a lot of work.
Fortunately, by sticking to a few basics, achieving exactly that is much less of a challenge than expected. Perl uses reference-counting garbage collection, which means that the elephant-in-the-room mistake to watch out for is creating circular references. Reference counting reclaims memory when the reference count of a variable reaches zero. If variable A points to B and B points to A, neither count can drop to zero, even when nothing else refers to them, so the memory never gets reclaimed, resulting in a memory leak.
A solution is either to avoid creating circular references, to remember to break them explicitly, or, better still, to use Scalar::Util's weaken to mark one of the references as weak. A weak reference does not add to the reference count, so the referenced variable is freed once only weak references point to it.
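To make the cycle and the fix concrete, here is a minimal sketch. The Node class and the $destroyed counter are hypothetical names made up for this illustration; only weaken comes from Scalar::Util.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;    # counts DESTROY calls so we can observe reclamation

{
    package Node;     # hypothetical class, for illustration only
    sub new     { my ($class, %args) = @_; return bless { name => $args{name} }, $class }
    sub DESTROY { $destroyed++ }
}

# Leaky version: parent and child hold strong references to each other.
{
    my $parent = Node->new(name => 'parent');
    my $child  = Node->new(name => 'child');
    $parent->{child} = $child;
    $child->{parent} = $parent;    # strong back-reference closes the cycle
}
my $after_leaky = $destroyed;      # still 0: neither object could be freed

# Fixed version: same structure, but the back-reference is weakened.
{
    my $parent = Node->new(name => 'parent');
    my $child  = Node->new(name => 'child');
    $parent->{child} = $child;
    $child->{parent} = $parent;
    weaken($child->{parent});      # no longer counts toward parent's refcount
}
my $after_weak = $destroyed - $after_leaky;    # 2: both objects were freed

print "freed after leaky scope:    $after_leaky\n";
print "freed after weakened scope: $after_weak\n";
```

The usual convention is to weaken the reference that points "upwards" (child to parent), so that the owner keeps its children alive and not the other way around.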
It is possible that, despite our best efforts, some code leaks. There are a variety of ways to deal with that:
- A wide array of helper modules is available, such as Devel::Leak, Devel::TrackObjects, Devel::Size and Devel::Cycle, to mention just a few.
- If you suspect that perl itself or an XS library leaks memory, then a memory-checking tool like Valgrind might be useful.
- Memory leaks can be tricky to find even with all the tools available, and personally I found that the best approach to dealing with them is not a tool but a method: triangulate the leak by building a list of observations about your program across as many independent variables as possible, then use a bit of logic to deduce how and under exactly what circumstances your program leaks. This is something like doing git bisect, but instead of commits you deal with observations, with the added difficulty of having to sort those observations to see a pattern.
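One cheap kind of observation to collect is how many instances of a class are alive after each unit of work. The sketch below is hand-rolled and uses only core Perl; Tracked, Job, handle_request and the "accidental cache" are all hypothetical names invented for the example, not part of any of the modules listed above.

```perl
use strict;
use warnings;

{
    package Tracked;    # hypothetical base class that counts live instances
    my %live;           # class name => number of instances currently alive
    sub new        { my ($class) = @_; $live{$class}++; return bless {}, $class }
    sub DESTROY    { my ($self)  = @_; $live{ ref $self }-- }
    sub live_count { my ($class) = @_; return $live{$class} // 0 }
}

{
    package Job;        # hypothetical unit of work
    our @ISA = ('Tracked');
}

# Simulated request handler with a deliberate bug: every third job is
# pushed onto a cache that nobody ever empties.
sub handle_request {
    my ($i, $cache) = @_;
    my $job = Job->new;
    push @$cache, $job if $i % 3 == 0;
    return;
}

my (@cache, @samples);
for my $i (1 .. 9) {
    handle_request($i, \@cache);
    push @samples, Job->live_count;    # one observation per request
}

print "live Job objects after each request: @samples\n";
# prints "live Job objects after each request: 0 0 1 1 1 2 2 2 3"
```

A count that returns to zero after every request suggests the class is innocent; a count that climbs in step with some input property (here, every third request) is exactly the kind of pattern the triangulation is looking for.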
On the CPU cycle front, avoiding premature optimization is JIT for programmers. Optimize when you have to, when you have the resources to do so, and when you have enough context to know where to start.
Context means information, and information comes from profiling. For profilers in Perl there is only one reasonable choice, not for lack of alternatives, but because Devel::NYTProf is that good. It is fast, versatile, robust and very informative.
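Getting a first profile takes very little ceremony. The script below is a made-up workload (the file name profile_demo.pl and the subs classify and build_report are hypothetical); the two commands in the leading comment are the standard way to run a script under Devel::NYTProf and render its report, assuming the module is installed.

```perl
# Hypothetical script, saved as profile_demo.pl. To profile it:
#
#   perl -d:NYTProf profile_demo.pl   # writes nytprof.out in the current directory
#   nytprofhtml                       # renders an HTML report into ./nytprof/
#
# The script itself needs nothing beyond core Perl.
use strict;
use warnings;

# Called once per item, so this is where the report should show
# most of the calls.
sub classify {
    my ($x) = @_;
    return $x % 15 == 0 ? 'fizzbuzz'
         : $x % 5  == 0 ? 'buzz'
         : $x % 3  == 0 ? 'fizz'
         :                'other';
}

# Tallies how many numbers in 1..$n fall into each category.
sub build_report {
    my ($n) = @_;
    my %count;
    $count{ classify($_) }++ for 1 .. $n;
    return \%count;
}

my $report = build_report(30_000);
print "$_: $report->{$_}\n" for sort keys %$report;
```

The report breaks time down per subroutine and per line, which is the "context" that tells you where optimizing is worth the effort.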
The information from Devel::NYTProf allowed me to write some high-performance code that looks pretty interesting when it comes to CPU cycle usage. Most of the cycles (80%+) are spent in well-written C libraries, even though I wrote most of the code in Perl, with the exception of contributing some code to an XS module. I got to eat my cake and keep it, too.
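The general idea of letting C do the heavy lifting can be illustrated with core modules alone. This is not the code described above, just a small sketch under the assumption that your List::Util is the usual XS (C) build: it compares a hand-written Perl loop with List::Util's sum using the core Benchmark module. The sub names sum_in_perl and sum_in_c are made up for the example.

```perl
use strict;
use warnings;
use List::Util qw(sum);
use Benchmark qw(cmpthese);

my @numbers = (1 .. 100_000);

# Pure-Perl loop: every addition is executed as Perl ops.
sub sum_in_perl {
    my $total = 0;
    $total += $_ for @_;
    return $total;
}

# Same result, but the loop runs inside List::Util's compiled code
# (assuming the XS build of List::Util).
sub sum_in_c {
    return sum(@_);
}

my $perl_total = sum_in_perl(@numbers);
my $c_total    = sum_in_c(@numbers);
print "both agree: ", ($perl_total == $c_total ? "yes" : "no"), "\n";

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    perl_loop => sub { sum_in_perl(@numbers) },
    list_util => sub { sum_in_c(@numbers) },
});
```

The exact ratio depends on the perl build and the hardware, so measure on your own setup before drawing conclusions.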
So, you like writing code in Perl? Then enjoy life, don't worry and JFDI. It might turn out much better than expected.
* You can always count on Microsoft to provide an implementation of something that happily treads off the beaten path, thereby exercising corner cases.
