2023 · workshop paper · International Workshop on Smalltalk Technologies · Lyon, France · CEUR-WS

Abstract

DataFrame is a tabular data structure for data analysis. It is a two-dimensional table (similar to a spreadsheet) with an extensive API for querying and manipulating the data. Data frames are available in many programming languages (e.g., pandas in Python or data.frame in R) and are the go-to tools for data scientists and machine learning practitioners. Pharo DataFrame was first released in 2017. Since then, the library has undergone many changes and improvements. In this paper, we present the Pharo DataFrame library, show examples of its usage, and compare its API to that of pandas. We give an overview of the changes that have been made since DataFrame v1.0, discuss the limitations of the current implementation, and present the roadmap for the future.

Keywords

pharo, dataframe, data analysis, data structure

BibTeX

@inproceedings{safina2023pharo,
  title = {Pharo DataFrame: Past, Present, and Future},
  author = {Safina, Larisa and Zaitsev, Oleksandr and Ferlicot-Delbecque, Cyril and Sow, Papa Ibrahima},
  year = {2023},
  month = {August},
  booktitle = {International Workshop on Smalltalk Technologies},
  publisher = {CEUR-WS},
  address = {Lyon, France},
  url = {https://ceur-ws.org/Vol-3627}
}