Working with Jupyter Notebooks and Python Modules

Python OS Module Overview

The os module in Python provides various functions to interact with the operating system. Here are some key functions:

  • os.getcwd() – Get the current working directory.
  • os.chdir(path) – Change the current working directory to the specified path.
  • os.listdir(path) – List all files and directories in the specified directory.
  • os.mkdir(path) – Create a new directory.
  • os.rmdir(path) – Remove the specified directory.
  • os.remove(path) – Delete a file at the specified path.
  • os.rename(src, dst) – Rename a file or directory from src to dst.
  • os.path.exists(path) – Check if the specified path exists.
  • os.path.isfile(path) – Verify if the given path is a file.
  • os.path.isdir(path) – Check if the specified path is a directory.
  • os.path.join() – Combine multiple path components into a single path using the correct path separator for the operating system. This avoids manual concatenation errors and ensures cross-platform compatibility.

Example:

import os

path1 = "C:\\Users"
path2 = "Documents"
full_path = os.path.join(path1, path2)
print(full_path)  # On Windows: C:\Users\Documents
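Several of the functions listed above can be combined into one short round trip. The following is a minimal sketch, assuming a throwaway temporary directory; the names data, draft.txt and final.txt are hypothetical and chosen only for illustration.

```python
import os
import tempfile

# Work inside a temporary directory so nothing on the real filesystem is touched.
with tempfile.TemporaryDirectory() as base:
    data_dir = os.path.join(base, "data")  # hypothetical directory name
    os.mkdir(data_dir)

    # Create a file, then rename it.
    old_file = os.path.join(data_dir, "draft.txt")
    with open(old_file, "w") as f:
        f.write("hello")
    new_file = os.path.join(data_dir, "final.txt")
    os.rename(old_file, new_file)

    listing = os.listdir(data_dir)
    print(listing)  # ['final.txt']

    # os.rmdir only removes empty directories, so delete the file first.
    os.remove(new_file)
    os.rmdir(data_dir)
    still_exists = os.path.exists(data_dir)
    print(still_exists)  # False
```

Note the order of the last two calls: os.rmdir raises an error on a directory that still contains files.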

Command Line Argument Parsing

Use argparse to handle input arguments in a script:

import argparse


def run(args):
    # Placeholder for the script's main logic; replace with real work.
    print(f"Reading from {args.input_path}, writing to {args.output_path}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Example script')
    parser.add_argument('--input_path', default="input", type=str, help='Input files directory')
    parser.add_argument('--output_path', default="output", type=str, help='Output directory')
    args = parser.parse_args()
    run(args)
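When the same parser is used from a notebook, calling parse_args() with no arguments reads sys.argv, which holds the kernel's own launch flags, so it typically fails with an "unrecognized arguments" error. One possible workaround, sketched below, is to pass the argument list explicitly. The helper name build_parser and the value data/raw are hypothetical.

```python
import argparse


def build_parser():
    """Build the same parser as the script above (helper name is illustrative)."""
    parser = argparse.ArgumentParser(description="Example script")
    parser.add_argument("--input_path", default="input", type=str, help="Input files directory")
    parser.add_argument("--output_path", default="output", type=str, help="Output directory")
    return parser


parser = build_parser()

# An empty list means "no command-line arguments", so every default applies.
defaults = parser.parse_args([])
print(defaults.input_path, defaults.output_path)  # input output

# An explicit list stands in for what would normally follow the script name.
custom = parser.parse_args(["--input_path", "data/raw"])
print(custom.input_path, custom.output_path)  # data/raw output
```

Passing a list also makes the parser easy to exercise in tests without touching sys.argv.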

Jupyter Notebook Best Practices

  • Restart: Clears all loaded modules and variables. Useful when module changes need to be reloaded.
  • Run All: Executes all cells in order.
  • Clear All Output: Useful when debugging to remove visual clutter from previous outputs.

Common Jupyter Notebook Issues

  1. TqdmWarning: If you see the warning "IProgress not found", install ipywidgets:

     pip install ipywidgets

  2. PyArrow dependency: Installing pyarrow increases the size of a pandas installation. In constrained environments, consider leaving it out unless you need it. To install it:

     pip install pyarrow

Python Module Import Issues

If train.py needs to import from a dataset.py file that lives in a different directory, append that directory to sys.path before the import statement:

import sys
sys.path.append('/path/to/dataset_directory')
import dataset
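The same approach can be tried end to end without touching a real project. This is a sketch under stated assumptions: it writes a stand-in module into a temporary directory, and the module name demo_dataset and its load() function are hypothetical, chosen so they cannot clash with an installed package.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as module_dir:
    # Stand-in for the directory that holds the module to import.
    with open(os.path.join(module_dir, "demo_dataset.py"), "w") as f:
        f.write("def load():\n    return [1, 2, 3]\n")

    # Avoid adding the same entry twice when a cell or script is re-run.
    if module_dir not in sys.path:
        sys.path.append(module_dir)

    # Files created at runtime may not be visible to the import system's caches.
    importlib.invalidate_caches()

    import demo_dataset

    samples = demo_dataset.load()
    print(samples)  # [1, 2, 3]

    # Remove the entry again so the change does not leak into later imports.
    sys.path.remove(module_dir)
```

Because sys.path.append adds the directory at the end of the search path, an installed package with the same name would be found first; sys.path.insert(0, ...) gives the directory priority instead.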

PyTorch Tips and Tricks

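  • The loss function (criterion) may raise an error if both output and target are empty. Make sure each batch actually contains data before computing the loss.
  • Slicing a tensor returns a view that shares memory with the original, while slicing a Python list returns a copy. Plain assignment (b = a) only binds a second name to the same object in both cases; use tensor.clone() for an independent copy.
  • Tensors do not support negative-step slicing such as t[::-1]; use torch.flip(t, dims=[0]) to reverse along a dimension.
  • Specify the data type when creating tensors: torch.tensor(data, dtype=torch.float32).
  • The item() method returns the value of a single-element tensor as a Python number.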
  • Convert a tensor to a list with tensor.tolist() for compatibility with plotting libraries like matplotlib.
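A few of these tips can be checked in a handful of lines. The sketch below assumes PyTorch is installed and reads the reference-versus-copy tip as being about slicing: a tensor slice is a view, a list slice is a copy. The variable names are illustrative.

```python
import torch

t = torch.tensor([1, 2, 3, 4], dtype=torch.float32)

# A tensor slice is a view: writing through it changes the original tensor.
view = t[:2]
view[0] = 10.0
print(t[0].item())  # 10.0

# A list slice is a copy: the original list is unaffected.
lst = [1, 2, 3, 4]
lst_copy = lst[:2]
lst_copy[0] = 10
print(lst[0])  # 1

# clone() gives an independent tensor.
independent = t.clone()
independent[1] = -1.0
print(t[1].item())  # 2.0

# torch.flip reverses along the given dimension; tolist() yields plain Python floats.
reversed_t = torch.flip(t, dims=[0])
print(reversed_t.tolist())  # [4.0, 3.0, 2.0, 10.0]
```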

Model Summary with torchsummary

torchsummary is a utility library for inspecting PyTorch model architecture. It displays layer names, input/output shapes, and parameter counts, which is helpful for debugging and understanding model structure.

Tags: os-module argparse jupyter-notebook pytorch torchsummary

Posted on Wed, 27 May 2026 19:22:00 +0000 by danlindley