Data processing for human beings

Easily write code to clean, transform, explore, and visualize data using Python.

Install Optimus now! 🚀

Get started fast:

pip install pyoptimus

Get the Optimus Book, written by its creators.📕

This book explains how to load, prepare, and join data at any scale. It is written by the creators of Optimus, who use this methodology in their day-to-day work.

Purchase on Amazon

What can Optimus do for you?

Easy to write and easy to read.

Not tech-savvy? Not a problem! Write transformation instructions in plain English.
No matter your data size.

Process a small dataset on your laptop or use a cluster to process big data. You can use Pandas, Dask, cuDF, Dask-cuDF, Vaex, or Spark to process your data.
Open-source

Download and use it anywhere. (Apache 2.0 License)

Easy API

In a little more than 10 lines you can remove whitespace and accents in all columns, lowercase all column data, drop a "dummyCol" column, transform a date format, sort a column, convert integers to strings, and replace "taco" with "taaaccoo" and "pizza" with "pizzza".
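To make the list of operations above concrete, here is a minimal sketch of the same steps in plain Python using only the standard library. It is not the Optimus API; the sample rows, the column names other than "dummyCol", and the helper names are assumptions made up for illustration.

```python
import unicodedata
from datetime import datetime

# Hypothetical sample data; only "dummyCol", "taco" and "pizza" come from the text.
rows = [
    {"name": "  José ", "food": "taco", "rank": 7, "arrival": "1980/04/10", "dummyCol": "x"},
    {"name": " ÁNGELA", "food": "pizza", "rank": 10, "arrival": "1975/12/01", "dummyCol": "y"},
]

REPLACEMENTS = {"taco": "taaaccoo", "pizza": "pizzza"}


def strip_accents(text):
    # Decompose characters, then drop the combining marks (the accents).
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_row(row):
    cleaned = {}
    for column, value in row.items():
        if column == "dummyCol":
            continue  # drop the unwanted column
        if isinstance(value, int):
            value = str(value)  # convert integers to strings
        # Trim whitespace, remove accents, lowercase.
        cleaned[column] = strip_accents(value.strip()).lower()
    # Transform the date format from yyyy/mm/dd to dd-mm-yyyy.
    parsed = datetime.strptime(cleaned["arrival"], "%Y/%m/%d")
    cleaned["arrival"] = parsed.strftime("%d-%m-%Y")
    # Replace whole values according to the mapping.
    cleaned["food"] = REPLACEMENTS.get(cleaned["food"], cleaned["food"])
    return cleaned


# Sort the cleaned rows by the "name" column.
cleaned_rows = sorted((clean_row(r) for r in rows), key=lambda r: r["name"])
```

A dataframe library applies the same steps column by column rather than row by row, which is what keeps the equivalent pipeline short.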

Connect to files and databases

Load and save Excel, CSV, JSON, Parquet, and Avro files locally or remotely. Read and insert data from MySQL, Redshift, SQL Server, Postgres, Oracle, Cassandra, and Presto.

Advanced features

All you need to handle your data in one place.

Outlier Detection

Easily detect outliers with out-of-the-box functions.
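As an illustration of what an outlier check does, the sketch below applies Tukey's fences (values more than 1.5 times the interquartile range beyond the quartiles) with the standard library. This is one common technique, shown here as an assumption; it is not taken from the Optimus source, and the function name and sample values are hypothetical.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values that fall outside the Tukey fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical column of measurements with one obvious outlier.
heights = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]
outliers = tukey_outliers(heights)
```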
Machine Learning

Apply linear regression, logistic regression or K-means easily.
String Clustering

Cluster similar strings and replace them with a single value.
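One way to picture string clustering is the key-collision ("fingerprint") approach: normalize each string to a key, group strings that share a key, and replace every member with the most frequent spelling. The sketch below uses only the standard library; whether Optimus uses this exact method is an assumption, and the function names and sample data are hypothetical.

```python
import string
import unicodedata
from collections import Counter, defaultdict


def fingerprint(text):
    # Strip accents, lowercase, drop punctuation, then sort the unique tokens.
    decomposed = unicodedata.normalize("NFKD", text)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    plain = plain.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(plain.split())))


def cluster_and_replace(values):
    # Group the original spellings by their fingerprint key.
    clusters = defaultdict(list)
    for value in values:
        clusters[fingerprint(value)].append(value)
    # Pick the most frequent spelling in each cluster as the single value.
    canonical = {
        key: Counter(members).most_common(1)[0][0]
        for key, members in clusters.items()
    }
    return [canonical[fingerprint(value)] for value in values]


cities = ["New York", "new york", "York, New", "New York", "Bogotá", "bogota", "Bogotá"]
unified = cluster_and_replace(cities)
```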
NLP functions

Stem and lemmatize verbs, tokenize strings, count words, remove diacritics, expand contracted words, and more.
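A few of these text operations can be sketched with the standard library alone. The example below expands contractions, removes diacritics, tokenizes, and counts words; stemming and lemmatization are left out because they need a dedicated NLP library. None of this is the Optimus API, and the contraction table is a deliberately tiny, hypothetical one.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical, deliberately small contraction table for illustration only.
CONTRACTIONS = {"don't": "do not", "it's": "it is", "can't": "cannot"}
CONTRACTION_PATTERN = re.compile(
    "|".join(re.escape(c) for c in CONTRACTIONS), flags=re.IGNORECASE
)


def expand_contractions(text):
    # Replace each known contraction with its expanded form.
    return CONTRACTION_PATTERN.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)


def remove_diacritics(text):
    # Decompose characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text):
    # Lowercase and keep runs of ASCII letters and digits as tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


sentence = "It's a café; don't skip the café."
tokens = tokenize(remove_diacritics(expand_contractions(sentence)))
word_counts = Counter(tokens)
```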

Used by forward-thinking companies

Here are a few of our favorites!

What People Say

“The group of BBVA Data & Analytics in Mexico has been using Optimus for the past months, and we have boosted our performance for cleansing, exploring and analyzing our data by 10x factor.”

Featured On

F.A.Q.

How is Optimus different from Pandas, Dask, cuDF, etc.?

Think of Optimus as a universal way to access many of the dataframe technologies available in Python. Optimus can work with Pandas, Dask, Spark, Vaex, cuDF, and Dask-cuDF as backends.
Why so many data processing engines?

Although most dataframe APIs try to mimic Pandas, there are always small differences in the way these dataframes work. With Optimus, we want to let you write your code once and then use whatever technology and infrastructure is available to you to process your data.

How can Optimus use CPUs and GPUs?

For CPU, Optimus can use Pandas, Dask, Spark, or Vaex. For GPUs, Optimus relies on cuDF and Dask-cuDF.
Not sure I understand the difference between Pandas, Dask, cuDF... and Optimus. Can you explain further?

Optimus focuses on giving you the best tools for all your data processing needs, from data quality and plotting to parsing dates, URLs, and emails, and NLP preparation.

Optimus gives you the best performance, so you don't have to reinvent the wheel.

Join our non-intrusive newsletter

Want to know about new releases and how you can help Optimus?
