We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

How we used vectorization for 1000x Python speedups (no C or Spark needed!)

Formal Metadata

Title
How we used vectorization for 1000x Python speedups (no C or Spark needed!)
Title of Series
Number of Parts
131
Author
Contributors
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Want to make all your code faster? With matrices, library knowledge, and a sprinkle of creativity, you can consistently speed up multivariate Python functions by 1000x! Modal optimization requires simple axioms - arithmetic, checking a case, calling the right sklearn function, and so on. When that’s not sufficient, three core tricks - converting conditional logic to set theory, stacking vectors into a matrix, and shaping data to match library expectations - cover the vast majority of real world cases (90% of the ~400 functions we vectorized). At Bloomberg, ESG (Environmental, Social, and Governance) Scores require complex computations on large data sets. Time-series computations are fundamental for Governance - one UDF infers board support for a policy from prior cyclical votes and other time offset inputs. By rewriting the pandas backfill as a series of reductions on a 4-tensor, we reduced the runtime from 45 minutes to 10 milliseconds! Analogously, due to real world complexity, finance UDFs can end up with 100+ if/else branches in one function. With a mix of De Morgan’s laws and sparse matrix representations, we simplified the cases and achieved 1000x+ speedups. We’ll conclude with a quick overview of cutting-edge tools, and hope you’ll leave with a concrete strategy for vectorizing financial models!