PsyInf Knowledge Base

KBI0011: Debugging installation issues related to multiple Python versions

authors: Michał Szczepanik <m.szczepanik@fz-juelich.de>
discussion: https://github.com/psychoinformatics-de/knowledge-base/pull/43
keywords: python version, package manager, pip, conda, MacOS, linux, Windows
software-versions: datalad_0.18.3, python_3.9.2

It is not uncommon to have multiple Python versions installed on a single machine. This can be necessary, for example, when several software stacks have different requirements (some programs even ship their own Python installation). Virtual environments and conda environments are two popular ways of managing multiple versions of Python packages and/or Python itself.

DataLad is a Python package which also provides a command line interface (datalad command). DataLad can be installed with different methods, including package managers such as apt, homebrew, conda, pip, or others, depending on user preferences. DataLad extensions are also distributed through different channels, but are most commonly installed using pip, the package installer for Python [1].

As with any other command line tool, the executable that runs when the datalad command is called is found by searching the directories listed in the PATH variable, in order; the first match wins. If DataLad is installed in several locations (e.g. under different Python versions or in multiple virtual environments), the installation used at a given moment depends on the current PATH, which is affected by system settings and user settings, and may be modified by an active virtual environment.
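
The lookup described above can be reproduced outside the shell. The following is a minimal sketch, not part of DataLad: the function name find_all_on_path and its extensions parameter are made up for illustration. It walks PATH in order and reports every executable with the given name, so shadowed installations become visible.

```python
import os


def find_all_on_path(command, path=None, extensions=("",)):
    """Return every executable named `command` along `path`, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH components
        for ext in extensions:
            candidate = os.path.join(directory, command + ext)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                matches.append(candidate)
    return matches


if __name__ == "__main__":
    # The first entry is the one a shell would run; later entries are shadowed.
    for position, match in enumerate(find_all_on_path("datalad"), start=1):
        print(position, match)
```

On Windows, executables carry a file extension, so a call such as find_all_on_path("datalad", extensions=(".exe", ".bat")) would likely be needed. More than one line of output means that several installations compete, and only the first one is used.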

In certain situations, the resolution order may lead to unexpected DataLad configurations, for example when the datalad command is provided by one Python installation but pip comes from another, or when an older Python version takes precedence in PATH. In the latter case, a seemingly fresh DataLad install may provide an outdated version, because pip only installs releases that support the Python it runs under (DataLad roughly follows the Python release cycle; e.g., support for Python 3.6 was dropped in DataLad 0.16).

This KBI presents examples of two such installation issues, and discusses simple and useful debugging steps.

Examples

Conda competing with Homebrew on MacOS

On MacOS, DataLad had been installed with homebrew. The conda base environment was activated by default. DataLad-Next had been installed with pip install. Problems were observed when cloning a datalad-annex URL (handling of which should be provided by DataLad-Next):

$ datalad clone "datalad-annex::(...)"
[WARNING] Requested extension 'next' is not available
install(error): (...)
[ERROR  ] Cannot handle URL 'datalad-annex::(...)'
$ python -m pip install datalad-next
Requirement already satisfied: datalad-next in /Users/jdoe/opt/anaconda3/lib/python3.9/site-packages (0.6.3)

In this instance, it turned out that although the conda environment was activated (and pip installed both DataLad and DataLad-Next into that environment), the homebrew-installed DataLad was higher in the PATH. Because the user preferred working with that conda environment, the problem was resolved by uninstalling homebrew-DataLad and relying on conda to provide DataLad and DataLad-Next.
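
A mismatch of this kind (datalad found in one place, pip in another) can be spotted with a rough heuristic: check whether the commands resolve to the same directory. The sketch below uses shutil.which from the standard library; the helper names are hypothetical, and the check is only a hint, since symlinked installations can live in different directories and still belong together.

```python
import os
import shutil


def resolved_directories(commands, path=None):
    """Map each command to the directory of its first match on PATH (or None)."""
    result = {}
    for command in commands:
        location = shutil.which(command, path=path)
        result[command] = os.path.dirname(location) if location else None
    return result


def share_one_directory(commands, path=None):
    """True only if every command is found and all resolve to one directory."""
    directories = set(resolved_directories(commands, path).values())
    return None not in directories and len(directories) == 1


if __name__ == "__main__":
    names = ["datalad", "pip", "python"]
    for name, directory in resolved_directories(names).items():
        print(f"{name}: {directory}")
    print("same directory:", share_one_directory(names))
```

If the directories differ, packages installed with pip may end up in an environment other than the one that provides the datalad command.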

Conflicting Python versions on Windows

Python 3.11 was installed on a Windows machine where PsychoPy had been previously used. After installing Python 3.11, DataLad and DataLad-Next were installed with pip install datalad datalad-next.

Problems manifested when trying to use the create-sibling-webdav command (provided by DataLad-Next):

$ datalad create-sibling-webdav ...
Traceback (most recent call last):
(...)
ModuleNotFoundError: No module named 'datalad.distributed.create_sibling_ghlike'

Inspecting the diagnostic output revealed that a standalone Python 3.6, installed with PsychoPy in C:\Program Files\PsychoPy3, was higher in PATH than Python 3.11 from AppData\Local. Calling pip install therefore installed packages under Python 3.6, resulting in an outdated DataLad version and a non-functional DataLad-Next. These are excerpts from the diagnostic output:

$ datalad wtf
# WTF
## datalad
  - version: 0.15.6
## extensions
  - next:
    - load_error: ModuleNotFoundError(No module named 'datalad.distributed.create_sibling_ghlike')
## python
  - implementation: CPython
  - version: 3.6.6

Because PsychoPy was not actively used on that machine, the issue was resolved by editing the PATH variable.
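
Before changing PATH permanently in the system settings, the effect of a different order can be tried out for a single process. The helper below is a minimal sketch with a made-up name (move_to_front); it only rearranges a PATH string and compares entries literally, so differences in letter case or trailing separators (which Windows ignores) are not detected.

```python
import os


def move_to_front(path_value, directory, sep=os.pathsep):
    """Return path_value with `directory` first and later duplicates removed."""
    remaining = [
        entry for entry in path_value.split(sep) if entry and entry != directory
    ]
    return sep.join([directory] + remaining)
```

Assigning the result to os.environ["PATH"] affects only the running Python process and the programs it starts, which makes it a low-risk way to confirm that a reordering would pick up the intended installation.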

How to check DataLad installation

The simplest way to start is by checking if the reported DataLad version matches expectations.

$ datalad --version
datalad 0.18.3

Another useful check is to see where the datalad command is actually coming from. This can be done with which (Unix-like shells), where (Windows CMD), or Get-Command (PowerShell). The example below comes from a Debian system where DataLad was installed with the apt system package manager:

$ which datalad
/usr/bin/datalad

The output changes after activating a virtual environment called “example” (done here with virtualenvwrapper) in which DataLad had been previously installed:

$ workon example
$ which datalad
/home/jdoe/.virtualenvs/example/bin/datalad

The which command (or its equivalents) can be used for any other program, including pip:

$ which pip
/home/jdoe/.virtualenvs/example/bin/
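
Knowing where pip lives does not yet tell which Python it installs into. The pip --version command prints that information on one line (pip's own version, its location, and the Python version in parentheses). The sketch below parses this line; the function names are hypothetical, and the regular expression assumes the usual output format of current pip releases.

```python
import re
import subprocess
import sys

PIP_VERSION_PATTERN = re.compile(
    r"^pip (?P<pip>\S+) from (?P<location>.+) \(python (?P<python>[\d.]+)\)\s*$"
)


def parse_pip_version(text):
    """Split one line of `pip --version` output into its three parts."""
    match = PIP_VERSION_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized pip --version output: {text!r}")
    return match.groupdict()


def pip_version_of(command=(sys.executable, "-m", "pip")):
    """Run `<command> --version` and parse what it prints."""
    completed = subprocess.run(
        [*command, "--version"], capture_output=True, text=True, check=True
    )
    return parse_pip_version(completed.stdout)
```

Calling pip_version_of(("pip",)) inspects whichever pip is first in PATH, while the default inspects the pip module of the running interpreter. The "python" entry of the result can then be compared with the Python version that DataLad itself reports.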

The most comprehensive information can be found in the output of the datalad wtf diagnostic command. The output has many sections, which can also be viewed individually with the --section argument. The most relevant sections in the context of installation are listed below.

DataLad version:

$ datalad wtf --section datalad
# WTF
## datalad
  - version: 0.18.3

Python version:

$ datalad wtf --section python
# WTF
## python
  - implementation: CPython
  - version: 3.9.2

Important environment variables, including PATH. PATH is usually a single long line with components separated by : (on Unix-like systems; Windows uses ;); in the output below they have been moved to separate lines for readability. Note that since this was done in the virtual environment mentioned above, the virtual environment path is first on the list and takes precedence over /usr/bin, where the installation available outside the virtualenv is located.

$ datalad wtf --section environment
# WTF
## environment
  - LANG: en_US.UTF-8
  - LANGUAGE: en_US.UTF-8
  - LC_ALL: en_US.UTF-8
  - LC_CTYPE: en_US.UTF-8
  - PATH:
    /home/jdoe/.virtualenvs/example/bin:
    /usr/local/sbin:
    /usr/local/bin:
    /usr/sbin:
    /usr/bin:
    (...)
    /home/jdoe/texlive/2022/bin/x86_64-linux:

Finally, the list of extensions (available in the currently used installation):

$ datalad wtf --section extensions
# WTF
## extensions
  (...)
  - next:
    - description: What is next in DataLad
    - entrypoints:

      (...)

    - load_error: None
    - module: datalad_next
    - version: 1.0.0b2
 (...)
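
When the datalad command itself is broken or ambiguous, similar information can be gathered from within a specific Python interpreter. The following is a minimal sketch (helper names are made up, Python 3.8 or later is assumed for importlib.metadata) that reports which interpreter is running and which versions of the datalad and datalad-next distributions it can see.

```python
import sys
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def report(distributions=("datalad", "datalad-next")):
    """Collect interpreter path, Python version, and package versions."""
    info = {
        "executable": sys.executable,
        "python": ".".join(str(part) for part in sys.version_info[:3]),
    }
    for name in distributions:
        info[name] = installed_version(name)
    return info


if __name__ == "__main__":
    for key, value in report().items():
        print(f"{key}: {value}")
```

Running the script with different interpreters (for example, the one inside a virtual environment and the system one) shows whether the packages were installed where they were expected. A value of None means that the interpreter in question does not have the package.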

The DataLad Handbook has a more general section on inspecting errors and reading the diagnostic output: How to get help.

Footnotes

[1] The name pip originated as an acronym and declaration: pip installs packages.

Copyright © 2023, PsyInf group; licensed under CC BY 4.0, https://creativecommons.org/licenses/by/4.0/