Creating a development environment — pandas 2.2.1 documentation (2024)

To test out code changes, you’ll need to build pandas from source, which requires a C/C++ compiler and a Python environment. If you’re making documentation changes, you can skip to contributing to the documentation, but if you skip creating the development environment you won’t be able to build the documentation locally before pushing your changes. It’s recommended to also install the pre-commit hooks.

Step 1: install a C compiler

How to do this will depend on your platform. If you choose to use Docker or Gitpod in the next step, then you can skip this step.

Windows

You will need Build Tools for Visual Studio 2022.

Note

You DO NOT need to install Visual Studio 2022. You only need “Build Tools for Visual Studio 2022”, found by scrolling down to “All downloads” -> “Tools for Visual Studio”. In the installer, select the “Desktop development with C++” workload.

Alternatively, you can install the necessary components on the command line using vs_BuildTools.exe.

Alternatively, you could use the WSL and consult the Linux instructions below.

macOS

To use the mamba-based compilers, you will need to install the Developer Tools using xcode-select --install.

If you prefer to use a different compiler, general information can be found here: https://devguide.python.org/setup/#macos

Linux

For Linux-based mamba installations, you won’t have to install any additional components outside of the mamba environment. The instructions below are only needed if your setup isn’t based on mamba environments.

Some Linux distributions will come with a pre-installed C compiler. To find out which compilers (and versions) are installed on your system:

# for Debian/Ubuntu:
dpkg --list | grep compiler
# for Red Hat/RHEL/CentOS/Fedora:
yum list installed | grep -i --color compiler

GCC (GNU Compiler Collection) is a widely used compiler that supports C and a number of other languages. If GCC is listed as an installed compiler, nothing more is required.

If no C compiler is installed, or you wish to upgrade, or you’re using a different Linux distribution, consult your favorite search engine for compiler installation/update instructions.
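As a distribution-independent complement to the package-manager queries above, a short Python check can report which common compiler executables are on your PATH. This is only a sketch: the candidate names (gcc, cc, clang) are an assumption rather than an exhaustive list, and finding an executable does not guarantee that it can build pandas.

```python
import shutil

# Candidate compiler executable names; an assumption, not an exhaustive list.
CANDIDATES = ("gcc", "cc", "clang")


def find_compilers(names=CANDIDATES):
    """Map each candidate name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_compilers().items():
        print(f"{name}: {path or 'not found'}")
```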

Let us know if you have any difficulties by opening an issue or reaching out on our contributor community Slack.

Step 2: create an isolated environment

Before we begin, please:

  • Make sure that you have cloned the repository

  • cd to the pandas source directory you just created with the clone command
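Before going further, it can help to confirm that the current directory really is the pandas source root. The sketch below simply looks for a few files that the commands on this page refer to (environment.yml, requirements-dev.txt, setup.py). The helper name is hypothetical, and the file list is an assumption drawn from this page, not an official check.

```python
from pathlib import Path

# Files that the commands on this page expect in the repository root.
EXPECTED_FILES = ("environment.yml", "requirements-dev.txt", "setup.py")


def missing_repo_files(directory="."):
    """Return the expected files that are absent from `directory`."""
    root = Path(directory)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_repo_files()
    if missing:
        print("Not in the pandas source root? Missing:", ", ".join(missing))
    else:
        print("This looks like the pandas source directory.")
```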

Option 1: using mamba (recommended)

mamba env create --file environment.yml
mamba activate pandas-dev

Option 2: using pip

You’ll need to have at least the minimum Python version that pandas supports. You also need to have setuptools 51.0.0 or later to build pandas.
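One way to check the setuptools requirement from inside the interpreter you plan to use is sketched below. The helper names are hypothetical, and the version parsing is deliberately simple (it only reads the leading numeric components), so treat the result as a hint rather than a definitive answer. The minimum Python version is not stated here, so the script only prints the running version for you to compare.

```python
import sys
from importlib import metadata

MIN_SETUPTOOLS = (51, 0, 0)  # minimum setuptools version stated above


def version_tuple(version):
    """Read the leading numeric components of a version string, padded to three."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'post1' or '0rc1'
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def setuptools_ok(installed, minimum=MIN_SETUPTOOLS):
    """True if the installed version string meets the minimum."""
    return version_tuple(installed) >= minimum


if __name__ == "__main__":
    print("Python:", ".".join(map(str, sys.version_info[:3])))
    try:
        installed = metadata.version("setuptools")
    except metadata.PackageNotFoundError:
        print("setuptools is not installed in this environment")
    else:
        status = "OK" if setuptools_ok(installed) else "too old"
        print(f"setuptools {installed}: {status}")
```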

Unix/macOS with virtualenv

# Create a virtual environment
# Use an ENV_DIR of your choice. We'll use ~/virtualenvs/pandas-dev
# Any parent directories should already exist
python3 -m venv ~/virtualenvs/pandas-dev
# Activate the virtualenv
. ~/virtualenvs/pandas-dev/bin/activate
# Install the build dependencies
python -m pip install -r requirements-dev.txt

Unix/macOS with pyenv

Consult the docs for setting up pyenv here.

# Create a virtual environment
# Use an ENV_DIR of your choice. We'll use ~/Users/<yourname>/.pyenv/versions/pandas-dev
pyenv virtualenv <version> <name-to-give-it>
# For instance:
pyenv virtualenv 3.9.10 pandas-dev
# Activate the virtualenv
pyenv activate pandas-dev
# Now install the build dependencies in the cloned pandas repo
python -m pip install -r requirements-dev.txt

Windows

Below is a brief overview of how to set up a virtual environment with PowerShell under Windows. For details, please refer to the official virtualenv user guide.

Use an ENV_DIR of your choice. We’ll use ~\virtualenvs\pandas-dev, where ~ is the folder pointed to by either the $env:USERPROFILE (PowerShell) or %USERPROFILE% (cmd.exe) environment variable. Any parent directories should already exist.

# Create a virtual environment
python -m venv $env:USERPROFILE\virtualenvs\pandas-dev
# Activate the virtualenv. Use activate.bat for cmd.exe
~\virtualenvs\pandas-dev\Scripts\Activate.ps1
# Install the build dependencies
python -m pip install -r requirements-dev.txt

Option 3: using Docker

pandas provides a Dockerfile in the root directory to build a Docker image with a full pandas development environment.

Docker Commands

Build the Docker image:

# Build the image
docker build -t pandas-dev .

Run Container:

# Run a container and bind your local repo to the container
# This command assumes you are running from your local repo
# but if not alter ${PWD} to match your local repo path
docker run -it --rm -v ${PWD}:/home/pandas pandas-dev
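If you launch the container from somewhere other than the repository root, the bind-mount path has to be adjusted by hand. The sketch below assembles the same docker run arguments for an arbitrary repository path. The function name is hypothetical, and the script only prints the command, so it can be tried without Docker installed.

```python
import shlex
from pathlib import Path


def docker_run_args(repo_path, image="pandas-dev"):
    """Build the `docker run` argument list shown above for a given repo path."""
    repo = Path(repo_path).resolve()  # Docker bind mounts need an absolute path
    return ["docker", "run", "-it", "--rm", "-v", f"{repo}:/home/pandas", image]


if __name__ == "__main__":
    # Print the command instead of running it, so this works without Docker.
    print(shlex.join(docker_run_args(".")))
```

Passing the returned list to subprocess.run would execute the command instead of printing it.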

Even easier, you can integrate Docker with the following IDEs:

Visual Studio Code

You can use the Dockerfile to launch a remote session with Visual Studio Code, a popular free IDE, using the .devcontainer.json file. See https://code.visualstudio.com/docs/remote/containers for details.

PyCharm (Professional)

Enable Docker support and use the Services tool window to build and manage images as well as run and interact with containers. See https://www.jetbrains.com/help/pycharm/docker.html for details.

Option 4: using Gitpod

Gitpod is an open-source platform that automatically creates the correct development environment right in your browser, reducing the need to install local development environments and deal with incompatible dependencies.

If you are a Windows user, unfamiliar with using the command line or building pandas for the first time, it is often faster to build with Gitpod. Here are the in-depth instructions for building pandas with Gitpod.

Step 3: build and install pandas

There are currently two supported ways of building pandas: pip/meson and setuptools (setup.py). Historically, pandas has only supported using setuptools to build pandas. However, this method requires a lot of convoluted code in setup.py and also has many issues in compiling pandas in parallel due to limitations in setuptools.

The newer build system invokes the meson backend through pip (via a PEP 517 build). It automatically uses all available cores on your CPU, and also avoids the need for manual rebuilds by rebuilding automatically whenever pandas is imported (with an editable install).

For these reasons, you should compile pandas with meson. Because the meson build system is newer, you may find bugs/minor issues as it matures. You can report these bugs here.

To compile pandas with meson, run:

# Build and install pandas
# By default, this will print verbose output
# showing the "rebuild" taking place on import (see section below for explanation)
# If you do not want to see this, omit everything after --no-build-isolation
python -m pip install -ve . --no-build-isolation --config-settings editable-verbose=true

Note

The version number is pulled from the latest repository tag. Be sure to fetch the latest tags from upstream before building:

# set the upstream repository, if not done already, and fetch the latest tags
git remote add upstream https://github.com/pandas-dev/pandas.git
git fetch upstream --tags

Build options

It is possible to pass options from the pip frontend to the meson backend if you would like to configure your install. Occasionally, you’ll want to use this to adjust the build directory, and/or toggle debug/optimization levels.

You can pass a build directory to pandas by appending --config-settings builddir="your builddir here" to your pip command. This option allows you to configure where meson stores your built C extensions, and allows for fast rebuilds.

Sometimes, it might be useful to compile pandas with debugging symbols, when debugging C extensions. Appending --config-settings setup-args="-Ddebug=true" will do the trick.

With pip, it is possible to chain together multiple config settings. For example, specifying both a build directory and building with debug symbols would look like --config-settings builddir="your builddir here" --config-settings=setup-args="-Dbuildtype=debug".
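When several config settings are chained, assembling the command programmatically can make it easier to avoid quoting mistakes. The following is a minimal sketch, not part of the pandas tooling: the function and its parameters are hypothetical, and it only uses the flags already shown on this page (builddir, setup-args, editable-verbose).

```python
import shlex
import sys


def pip_install_args(builddir=None, setup_args=(), verbose_editable=False):
    """Assemble an editable-install command with optional meson config settings."""
    args = [sys.executable, "-m", "pip", "install", "-ve", ".", "--no-build-isolation"]
    if builddir is not None:
        args += ["--config-settings", f"builddir={builddir}"]
    for setup_arg in setup_args:
        # Each meson setup argument gets its own --config-settings entry.
        args += ["--config-settings", f"setup-args={setup_arg}"]
    if verbose_editable:
        args += ["--config-settings", "editable-verbose=true"]
    return args


if __name__ == "__main__":
    # "build/debug" is a placeholder build directory chosen for illustration.
    command = pip_install_args(builddir="build/debug", setup_args=["-Dbuildtype=debug"])
    print(shlex.join(command))
```

The script prints the command rather than running it; passing the list to subprocess.run from the pandas source directory would perform the install.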

Compiling pandas with setup.py

Note

This method of compiling pandas will be deprecated and removed very soon, as the meson backend matures.

To compile pandas with setuptools, run:

python setup.py develop

Note

If pandas is already installed (via meson), you have to uninstall it first:

python -m pip uninstall pandas

This is because python setup.py develop will not uninstall the loader script that meson-python uses to import the extension from the build folder, which may cause errors such as a FileNotFoundError to be raised.

Note

You will need to repeat this step each time the C extensions change, for example if you modified any file in pandas/_libs or if you did a fetch and merge from upstream/main.

Checking the build

At this point you should be able to import pandas from your locally built version:

$ python
>>> import pandas
>>> print(pandas.__version__)  # note: the exact output may differ
2.0.0.dev0+880.g2b9e661fbb.dirty
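The version string of a development build carries more than a release number. The sketch below splits a string shaped like the sample output into its parts. Reading the fields as tag version, commits since the tag, abbreviated commit hash, and a dirty marker for uncommitted changes is an assumption based on the common versioneer-style layout. The pattern is inferred from the sample and may not cover every build.

```python
import re

# Pattern inferred from the sample output above; other builds may differ.
VERSION_RE = re.compile(
    r"^(?P<base>[^+]+)"          # version taken from the latest tag
    r"(?:\+(?P<distance>\d+)"    # assumed: number of commits since that tag
    r"\.g(?P<commit>[0-9a-f]+)"  # assumed: abbreviated git commit hash
    r"(?P<dirty>\.dirty)?)?$"    # assumed: marks uncommitted local changes
)


def parse_dev_version(version):
    """Split a version string into its parts, or return None if it does not match."""
    match = VERSION_RE.match(version)
    if match is None:
        return None
    distance = match.group("distance")
    return {
        "base": match.group("base"),
        "distance": int(distance) if distance is not None else None,
        "commit": match.group("commit"),
        "dirty": match.group("dirty") is not None,
    }


if __name__ == "__main__":
    try:
        import pandas
    except ImportError:
        print("pandas is not importable in this environment")
    else:
        print(parse_dev_version(pandas.__version__))
```

A base version that looks out of date is a hint that the upstream tags were not fetched before building.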

At this point you may want to try running the test suite.

Keeping up to date with the latest build

When building pandas with meson, importing pandas will automatically trigger a rebuild, even when C/Cython files are modified. By default, no output will be produced by this rebuild (the import will just take longer). If you would like to see meson’s output when importing pandas, you can set the environment variable MESONPY_EDITABLE_VERBOSE. For example, this would be:

# On Linux/macOS
MESONPY_EDITABLE_VERBOSE=1 python
# Windows
set MESONPY_EDITABLE_VERBOSE=1 # Only need to set this once per session
python
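The same variable can be set for a single child process from Python, which avoids changing your shell session. This is a minimal sketch with a hypothetical helper name. The child process here only echoes the variable so that the example runs without pandas installed.

```python
import os
import subprocess
import sys


def verbose_env(base=None):
    """Copy an environment mapping and set MESONPY_EDITABLE_VERBOSE to 1."""
    env = dict(os.environ if base is None else base)
    env["MESONPY_EDITABLE_VERBOSE"] = "1"
    return env


if __name__ == "__main__":
    # The child only echoes the variable so the sketch runs without pandas;
    # swap the payload for "import pandas" to watch a real rebuild.
    child = "import os; print(os.environ['MESONPY_EDITABLE_VERBOSE'])"
    subprocess.run([sys.executable, "-c", child], env=verbose_env(), check=True)
```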

If you would like to see this verbose output every time, you can set the editable-verbose config setting to true like so:

python -m pip install -ve . --config-settings editable-verbose=true

Tip

If you ever find yourself wondering whether setuptools or meson was used to build your pandas, you can check the value of pandas._built_with_meson, which will be True if meson was used to compile pandas.
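A small wrapper around that check is sketched below. The function name is hypothetical, and treating a missing attribute as "not built with meson" is an assumption made so that the check does not raise an error if the attribute is absent.

```python
def built_with_meson(module):
    """Report the module's `_built_with_meson` flag, treating a missing flag as False."""
    return bool(getattr(module, "_built_with_meson", False))


if __name__ == "__main__":
    try:
        import pandas
    except ImportError:
        print("pandas is not importable in this environment")
    else:
        print("built with meson:", built_with_meson(pandas))
```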
