Introduction |
|||||||
XXX this page needs updating XXX Psyco is a specializing compiler. In a few words let us first see: |
|||||||
What you can do with it | |||||||
In short: run your existing Python software much faster, with no change in your source. Think of Psyco as a kind of just-in-time (JIT) compiler, a little bit like Java's, that emits machine code on the fly instead of interpreting your Python program step by step. The result is that your unmodified Python programs run faster.
Benefits
Drawbacks
|
|||||||
Expected results | |||||||
The actual performance gains can be very large. For common code, expect at least a 2x speed-up, more typically 4x. But where Psyco shines is when running algorithmic code: these are the first pieces of code that you would consider rewriting in C for performance. If you are in this situation, consider using Psyco instead! You might get 10x to 100x speed-ups. It is theoretically possible to speed up this kind of code up to the performance of C itself.

Because of the nature of Psyco, it is difficult to forecast the actual performance gains for a given program. Just try and see.

The memory overhead of Psyco is currently quite large. I don't know yet if it will be seriously reduced in the future, but I have a few ideas in this direction. This overhead is proportional to the amount of Python code that Psyco rewrites; thus if your application has a few algorithmic "core" functions, these are the ones you will want Psyco to accelerate, not the whole program. In the future, Psyco will be transparently linked with one of the Python profilers and automatically select which functions are worth accelerating. |
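As a minimal sketch of accelerating only a "core" function, the snippet below binds Psyco to a single function and leaves the rest of the program alone. The function `dot` and the flag `PSYCO_ACTIVE` are hypothetical names chosen for illustration; `psyco.bind` is the call from the Psyco user guide, and the `try`/`except` is only there so the program still runs unchanged where Psyco is not installed.

```python
def dot(xs, ys):
    """Hypothetical algorithmic 'core' function: a plain-Python dot product."""
    total = 0
    for i in range(len(xs)):
        total += xs[i] * ys[i]
    return total


try:
    import psyco
    # Ask Psyco to compile only this function, not the whole program,
    # which keeps the memory overhead limited to the hot code.
    psyco.bind(dot)
    PSYCO_ACTIVE = True
except ImportError:
    # Psyco is not available: the function simply runs interpreted.
    PSYCO_ACTIVE = False

result = dot([1, 2, 3], [4, 5, 6])
```

The source of `dot` itself is untouched, so whether any speed-up appears can be checked simply by timing the program with and without the `psyco.bind` line.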
|||||||
Differences with traditional JIT compilers | |||||||
Unlike the Java JITs, which write one machine-code version of each of your functions and deliver a constant speed-up (typically around 2x), Psyco uses the actual run-time data that your program manipulates to write potentially several versions of the machine code, each specialized for a different kind of data. Depending on how well it can do this, you can get smaller or larger speed-ups. In extreme cases, when all computations can be done in advance, nothing remains to be done at run-time.

There is no static analysis of your program, no separate compilation phase, no type inference. It is all done at run-time: Psyco infers from the values your program manipulates some restrictions about the variables, like always containing an integer, or a list of strings of length 1, or a tuple whose first item is zero. Using these restrictions, efficient machine code can be emitted. Only this second phase is similar to what a C or a JIT compiler does. In C or in Java the restrictions are static: they come from the programmer's type declarations.

The purpose of Psyco is to try to dynamically build the restrictions that will produce the best code for the data currently being manipulated. The flexibility comes from the fact that if any data that does not fit is found later, new machine code can be emitted. This means that your program works in all cases, but the common case gets the fastest code, without the overhead of having to care about the exceptional cases. From this perspective we can theoretically expect faster results than what you get with low-level languages: programs optimized for the data that they currently handle. |
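The idea of keeping several versions of one function, each specialized for the kind of data actually seen, can be sketched in pure Python. This is only a toy model and not how Psyco is implemented: Psyco emits machine code and specializes on finer restrictions than argument types, whereas this sketch merely caches one Python callable per combination of argument types. All names (`specializing`, `make_add`, `versions`) are hypothetical.

```python
def specializing(make_version):
    """Wrap a version factory: `make_version(types)` returns a callable
    specialized for that tuple of argument types."""
    versions = {}  # maps a tuple of argument types to a specialized version

    def dispatch(*args):
        key = tuple(type(a) for a in args)
        version = versions.get(key)
        if version is None:
            # Data of a new kind was found: build a new version on demand.
            version = versions[key] = make_version(key)
        return version(*args)

    dispatch.versions = versions
    return dispatch


def make_add(types):
    """Build an 'add' specialized for the observed argument types."""
    if types == (int, int):
        def add_ints(a, b):
            return a + b          # common case: plain integer addition
        return add_ints
    if types == (str, str):
        def add_strs(a, b):
            return "".join((a, b))  # string case: concatenation
        return add_strs

    def add_generic(a, b):
        return a + b              # fallback for any other kind of data
    return add_generic


add = specializing(make_add)

r1 = add(1, 2)       # builds the (int, int) version
r2 = add("a", "b")   # builds the (str, str) version
r3 = add(3, 4)       # reuses the (int, int) version
```

After these three calls `add.versions` holds two entries. A later call such as `add(1.5, 2)` does not fail: it triggers the construction of a third version, mirroring the way new code can be emitted when data that does not fit the existing restrictions shows up.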
|||||||
My goals | |||||||
My goal in programming Psyco is to help narrow the following wide gap between academic computer science and industrial programming tools. While the former develops a number of programming languages with very cool semantics and features, the latter sticks with low-level languages, principally for performance reasons, on the grounds that the higher the level of a language, the slower it is. Although clearly justified in practice, this belief is theoretically false, and even completely inverted: for large, evolving systems like a whole operating system and its applications, high-level programming can deliver much higher performance.

The new class of languages called "dynamic scripting languages", of which Python is an example, is semantically close to long-studied languages like Lisp. The constraints behind their designs are however different: some high-level languages can be statically compiled relatively well, we can do some type inference, and so on, whereas with Python it is much harder, because the design goal was different. We now have machines powerful enough to stick with interpretation for a number of applications. This, of course, contributes to the common belief that high-level languages are terribly slow.

Psyco is both an academic and an industrial project. It is an academic experiment testing some new techniques in the field of on-line specialization. It delivers an industrially useful performance benefit for Python. And first of all it is a modest step towards:
|
|||||||