Data processing for human beings

Easily write code to clean, transform, explore and visualize data using Python.

Get started fast:

pip install pyoptimus

Get the Optimus Book, written by its creators. 📕

This book explains how to load, prepare and join data at any scale. Its authors, the creators of Optimus, use this methodology in their day-to-day work.

What can Optimus do for you?

  • Easy to write and easy to read.

    Not tech-savvy? Not a problem! Write transformation instructions in plain English.
  • No matter your data size.

    Process a small dataset on your laptop or use a cluster to process big data. You can use Pandas, Dask, cuDF, Dask-cuDF, Vaex or Spark to process your data.
  • Open-source

    Download and use it anywhere. (Apache 2.0 License)

Easy API

In a little more than 10 lines you can remove white space and accents in all columns, lowercase all column data, drop a "dummyCol" column, transform a date format, sort a column, convert integers to strings, and replace "taco" with "taaaccoo" and "pizza" with "pizzza".

Optimus transformation example
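
As a rough illustration of what the steps listed above do to the data, here is a sketch in plain standard-library Python. It does not use the Optimus API, and it leaves out the date-format step. The helper names `remove_accents`, `clean_value` and `clean_rows`, as well as the sample rows, are hypothetical and exist only for this example.

```python
import unicodedata


def remove_accents(text):
    """Drop combining marks, so that 'café' becomes 'cafe'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_value(value):
    """Convert to string, trim, strip accents, lowercase, apply replacements."""
    text = str(value)  # integers become strings here
    text = remove_accents(text.strip()).lower()
    return text.replace("taco", "taaaccoo").replace("pizza", "pizzza")


def clean_rows(rows, drop=("dummyCol",), sort_by=None):
    """Clean every value, drop unwanted columns, optionally sort by a column."""
    cleaned = [
        {col: clean_value(val) for col, val in row.items() if col not in drop}
        for row in rows
    ]
    if sort_by is not None:
        cleaned.sort(key=lambda row: row[sort_by])
    return cleaned


# Hypothetical sample data, invented for this sketch.
rows = [
    {"name": "  Óscar ", "food": "Pizza", "age": 31, "dummyCol": "x"},
    {"name": "ana", "food": " TACO", "age": 27, "dummyCol": "y"},
]

result = clean_rows(rows, sort_by="name")
print(result)
```

A dataframe library applies the same kind of operation column by column rather than value by value, which is what makes a short, readable pipeline possible.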

Connect to files and databases

Load and save Excel, CSV, JSON, Parquet and Avro files, locally or remotely. Read from and write to MySQL, Redshift, SQL Server, PostgreSQL, Oracle, Cassandra, and Presto.

Optimus loading database example

Advanced features

All you need to handle your data in one place.
  • Outlier Detection

    Easily detect outliers with out-of-the-box functions.
  • Machine Learning

    Apply linear regression, logistic regression or K-means easily.
  • String Clustering

    Cluster similar strings and replace them with a single value.
  • NLP functions

    Stem and lemmatize verbs, tokenize strings, count words, remove diacritics, expand contracted words, and more.
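
To make the String Clustering item more concrete, here is a minimal sketch of one common approach, key-collision clustering with a "fingerprint" key. It uses only the Python standard library. Whether Optimus implements clustering exactly this way is an assumption, not something stated here, and the names `fingerprint` and `cluster_strings` are hypothetical.

```python
import string
import unicodedata
from collections import Counter, defaultdict


def fingerprint(value):
    """Build a normalized key: lowercase, no accents or punctuation, sorted unique tokens."""
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(text.split())))


def cluster_strings(values):
    """Map each distinct value to the most frequent spelling sharing its fingerprint."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)

    mapping = {}
    for members in groups.values():
        canonical = Counter(members).most_common(1)[0][0]
        for member in members:
            mapping[member] = canonical
    return mapping


# Hypothetical sample values, invented for this sketch.
cities = ["New York", "new york", "York, New", "New York", "Boston", "boston.", "Boston"]
mapping = cluster_strings(cities)
print([mapping[city] for city in cities])
```

Values that differ only in case, punctuation, accents or word order collapse to the same key, so each group can be replaced by a single spelling.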

Used by forward-thinking companies

Here are a few of our favorites!

What People Say

  • “The group of BBVA Data & Analytics in Mexico has been using Optimus for the past months, and we have boosted our performance for cleansing, exploring and analyzing our data by 10x factor.”

Featured On

F.A.Q.

  • How is Optimus different from Pandas, Dask, cuDF, etc.?

    Think of Optimus as a universal way to access many of the dataframe technologies available in Python. Optimus can work with Pandas, Dask, Spark, Vaex, cuDF, and Dask-cuDF as backends.

  • Why so many data processing engines?

    Although most dataframe APIs try to mimic Pandas, there are always small differences in how these dataframes work. With Optimus, we want to let you write your code and then use whatever technology and infrastructure is available to you to process your data.

  • How can Optimus use CPUs and GPUs?

    For CPU, Optimus can use Pandas, Dask, Spark, or Vaex. For GPUs, Optimus relies on cuDF and Dask-cuDF.

  • I'm not sure I understand the difference between Pandas, Dask, cuDF... and Optimus. Can you explain further?

    Optimus focuses on giving you the best tools for all your data processing needs, from data quality and plotting to parsing dates, URLs and emails, and NLP preparation.

    Optimus gives you the best performance, so you don't have to reinvent the wheel.

Join our non-intrusive newsletter

Want to know about new releases and how you can help Optimus?