I Built a Tool That Turns Any Python Repo Into an Executable Jupyter Notebook

Last week I was doing the Tower Research Capital x MIT Limestone Data Challenge — a multi-part quant problem with matrix completion, convex optimization, and trading strategy work. The kind of thing where you naturally end up with a real repo: multiple modules, shared utilities, pipeline scripts, cross-file imports, intermediate outputs, argparse, the whole deal.

Everything was clean. The repo ran perfectly.

Then I saw the submission requirement:

Submit a single .ipynb file.

So I did what everyone does: started manually copying files into notebook cells, reordering code, trying to inline imports, fixing execution order, and debugging all the dumb notebook-specific breakage that had nothing to do with the actual problem.

It sucked.

So I built a tool for it.

rep2nb

rep2nb is a pip-installable package that converts an entire Python repo into a single executable Jupyter notebook.

pip install rep2nb
rep2nb myproject/ -o submission.ipynb

If your repo runs, the notebook should too.

Why this is harder than it sounds

At first glance this seems easy: just grab every .py file and dump them into cells.

That works for toy repos. It breaks immediately on anything real.

The moment you have cross-file imports, scripts that depend on outputs from other scripts, if __name__ == "__main__" blocks, package structure, or multiple entry points, naive copy-paste falls apart.

rep2nb handles the parts that actually make this annoying.

What it does

1. Orders files correctly

If pipeline.py depends on coefficients.py, and coefficients.py depends on matrix.py, those need to execute in the right order in the notebook.

rep2nb parses the repo with Python’s AST, builds a dependency graph, and topologically sorts the files so dependencies run first.
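The mechanism can be sketched with the stdlib alone. Here `local_imports` and the toy file contents are illustrative, not rep2nb's actual internals:

```python
import ast
from graphlib import TopologicalSorter

def local_imports(source, known_modules):
    """Collect imports that refer to other files in the same repo."""
    deps = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            deps.update(a.name for a in node.names if a.name in known_modules)
        elif isinstance(node, ast.ImportFrom) and node.module in known_modules:
            deps.add(node.module)
    return deps

# Toy repo: pipeline -> coefficients -> matrix
files = {
    "matrix": "import numpy as np",
    "coefficients": "import matrix",
    "pipeline": "from coefficients import solve",
}
graph = {name: local_imports(src, files) for name, src in files.items()}
order = list(TopologicalSorter(graph).static_order())
# dependencies sort before their dependents: matrix, coefficients, pipeline
```

`graphlib.TopologicalSorter` (stdlib since Python 3.9) does the sorting; the interesting part is filtering AST imports down to repo-local names so `numpy` doesn't end up in the graph.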

2. Preserves imports across files

In a normal repo, Python’s module system handles imports for you. In a notebook, there are no real files anymore — just a flat execution environment.

rep2nb registers executed file contents into sys.modules, so import x, from x import y, relative imports, and package imports still resolve the way they did in the repo.
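The core trick, stripped down, is to exec each file's source into a fresh module object and register it under the file's module name. This is a simplified sketch; real package and relative imports need more bookkeeping:

```python
import sys
import types

def register_module(name, source):
    """Exec source in a new module namespace and make it importable."""
    mod = types.ModuleType(name)
    exec(source, mod.__dict__)
    sys.modules[name] = mod
    return mod

register_module("matrix", "def fill(): return 42")

# Later cells can now import it as if it were a real file on disk,
# because the import machinery checks sys.modules first.
import matrix
print(matrix.fill())  # 42
```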

3. Handles entry points intelligently

Not every if __name__ == "__main__": block should be executed.

If a file is really a library module, unwrapping that block would run test code or side effects in the middle of the notebook. rep2nb distinguishes between true entry points and support modules. Entry points get unwrapped. Library modules get their guards stripped. You can also override this manually with --entry.
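Detecting and rewriting the guard is a small AST transform. A minimal sketch (`rewrite` and `is_main_guard` are hypothetical names, not rep2nb's API):

```python
import ast

def is_main_guard(node):
    """True for a top-level `if __name__ == "__main__":` block."""
    return (
        isinstance(node, ast.If)
        and isinstance(node.test, ast.Compare)
        and isinstance(node.test.left, ast.Name)
        and node.test.left.id == "__name__"
        and any(
            isinstance(c, ast.Constant) and c.value == "__main__"
            for c in node.test.comparators
        )
    )

def rewrite(source, entry_point):
    """Unwrap the guard for entry points, strip it for library modules."""
    tree = ast.parse(source)
    body = []
    for node in tree.body:
        if is_main_guard(node):
            if entry_point:
                body.extend(node.body)  # hoist guarded code to top level
        else:
            body.append(node)
    tree.body = body
    return ast.unparse(tree)

src = 'def main():\n    print("run")\n\nif __name__ == "__main__":\n    main()\n'
entry_version = rewrite(src, entry_point=True)    # `main()` runs at top level
library_version = rewrite(src, entry_point=False) # guard removed entirely
```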

4. Splits multi-project repos into sections

The repo that motivated this had multiple independent sub-projects, each with its own pipeline and helper files.

rep2nb detects subdirectories that should behave like isolated notebook sections and handles them separately. That includes:

  • markdown headers for section boundaries
  • changing directories so relative paths still work
  • clearing sys.modules state between sections
  • cleaning up temp directories after execution
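As a sketch, the isolation part of the list above is essentially a context manager around `os.chdir` plus a snapshot of `sys.modules` (a hypothetical simplification, not rep2nb's actual code):

```python
import os
import sys
from contextlib import contextmanager

@contextmanager
def isolated_section(workdir):
    """Run one notebook section in its own directory with a clean module state."""
    before = set(sys.modules)
    old_cwd = os.getcwd()
    os.makedirs(workdir, exist_ok=True)
    os.chdir(workdir)
    try:
        yield
    finally:
        os.chdir(old_cwd)
        # drop modules the section registered so names can't leak across sections
        for name in set(sys.modules) - before:
            del sys.modules[name]

with isolated_section(os.getcwd()):
    pass  # one sub-project's cells would execute here
```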

5. Fixes the annoying notebook edge cases

A lot of repo-to-notebook breakage comes from tiny environment assumptions. rep2nb patches the common ones automatically:

  • argparse scripts that choke on Jupyter kernel args
  • code that expects __file__ to exist
  • optional generation of a pip install cell from import analysis
  • extraction of module docstrings into markdown cells
  • automatic README inclusion as notebook header
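The first two are worth showing inline. Both shims below are hypothetical simplifications of what a converter has to inject near the top of the notebook:

```python
import os
import sys

# 1. Jupyter kernels put their own flags (e.g. `-f kernel.json`) in sys.argv,
#    which makes argparse scripts error out. Resetting argv to just a program
#    name lets parse_args() fall back to defaults.
sys.argv = ["notebook"]

# 2. Notebook cells have no source file, so __file__ may be undefined.
#    Shimming it keeps path-relative code working.
globals().setdefault("__file__", os.path.join(os.getcwd(), "notebook"))

import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--n", type=int, default=3)
args = parser.parse_args()  # no longer chokes on kernel arguments
```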

Stress test

I tested rep2nb on the contest repo that originally annoyed me into building it.

That repo had:

  • 12 Python files
  • 3 independent sub-projects
  • cross-file imports
  • argparse-driven scripts
  • code relying on __file__
  • intermediate file handoffs
  • external dependencies like numpy, pandas, scipy, and torch

The generated notebook ran top to bottom without errors.

That was the bar: not “looks nice,” not “kind of works,” but actually executable on a repo that was annoying enough to justify making the tool in the first place.

What it does not handle

There are still limits, and I’d rather be explicit about them:

  • dynamic imports where module names are computed at runtime
  • subprocess-based Python execution that expects separate files
  • circular imports, which are detected and surfaced as errors
  • non-Python files, aside from listing them so you know what else needs to come along
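For reference, surfacing a circular import is just finding a back edge in the dependency graph with depth-first search; a minimal sketch:

```python
def find_cycle(graph):
    """Return one import cycle as a list of module names, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / in progress / done
    color = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, ()):
            if color.get(dep, WHITE) == GRAY:  # back edge: cycle found
                return stack[stack.index(dep):] + [dep]
            if color.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        color[node] = BLACK
        stack.pop()
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None

print(find_cycle({"a": ["b"], "b": ["a"]}))  # ['a', 'b', 'a']
print(find_cycle({"a": ["b"], "b": []}))     # None
```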

Usage

CLI

rep2nb myproject/

rep2nb myproject/ \
  --entry pipeline.py \
  --exclude tests/ \
  --include-pip-install \
  -o submission.ipynb

rep2nb contest-repo/ \
  --entry problem-1/pipeline.py \
  --entry problem-2/main.py

Python API

from rep2nb import convert

convert(
    "myproject/",
    output="submission.ipynb",
    entry=["generate.py", "analyze.py"],
    exclude=["tests/"],
    include_pip_install=True,
)

Under the hood

The pipeline is roughly:

  1. discover Python files
  2. detect independent sections vs real packages
  3. parse ASTs for imports, docstrings, definitions, and main guards
  4. build dependency graphs
  5. topologically sort execution order
  6. rewrite imports / guards / runtime assumptions
  7. assemble the final notebook with code cells, markdown cells, and cleanup logic
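Step 7 is the only part that touches the notebook format itself. An .ipynb file is plain JSON, so assembly needs nothing beyond the stdlib. A sketch targeting the nbformat 4 schema, with the minimum cell fields the format requires:

```python
import json

def make_cell(cell_type, source):
    """Build one cell in the nbformat v4 JSON schema."""
    cell = {
        "cell_type": cell_type,
        "metadata": {},
        "source": source.splitlines(keepends=True),
    }
    if cell_type == "code":
        cell.update(outputs=[], execution_count=None)
    return cell

notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        make_cell("markdown", "## matrix.py"),
        make_cell("code", "def fill():\n    return 42"),
    ],
}

with open("submission.ipynb", "w") as f:
    json.dump(notebook, f, indent=1)
```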

Get it

pip install rep2nb

I built it because I got annoyed enough doing this by hand once.

Felt like there was a decent chance other people had the exact same problem.

Tye Phoenix