Feature request: Custom output filename #193

siddhantac · 2025-02-09T05:08:22Z

It would be great to have some way to have a deterministic filename for the output CSV file. This would help me automate my transaction import process.

I can think of 2 ways to do this:

1. Specify an output file name (via CLI flag). If missing, monopoly can default to the generated filename that it currently uses.
2. Dump CSV output to stdout (via CLI flag). This provides more flexibility as it can be piped into other apps & tools.

I'd be happy to contribute code if there is interest.

benjamin-awd · 2025-02-18T16:02:39Z

Option 1 is a bit tricky, since monopoly is built to support multi-file/directory input

I'd be happy to review a PR for Option 2 - currently there's a pretty-print argument in the CLI that uses tabulate to generate a neatly formatted table, but I think it could be extended to optionally dump to CSV using the default csv module, since tabulate doesn't support it.

monopoly/src/monopoly/cli.py

Lines 143 to 146 in 8dfc42a

    
    if print_df:
        pprint_transactions(transactions, statement, file)
        # don't load to CSV if pprint
        return None
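As a rough illustration of the CSV-to-stdout idea, the sketch below writes rows with the standard csv module to any text stream. The `Transaction` dataclass and the `dump_csv` helper are hypothetical stand-ins for this example, not part of monopoly's API, and the column names are assumed.

```python
import csv
import sys
from dataclasses import dataclass
from typing import Iterable, TextIO


@dataclass
class Transaction:
    # Hypothetical stand-in for monopoly's transaction objects
    date: str
    description: str
    amount: float


def dump_csv(transactions: Iterable[Transaction], stream: TextIO = sys.stdout) -> None:
    """Write transactions as CSV to a text stream (stdout by default)."""
    writer = csv.writer(stream, lineterminator="\n")
    # Header row; the column names are assumed for this sketch
    writer.writerow(["date", "description", "amount"])
    for t in transactions:
        writer.writerow([t.date, t.description, t.amount])


if __name__ == "__main__":
    dump_csv([Transaction("2025-02-24", "Food", -23.0)])
```

Because the stream is a parameter, the same function could serve both a file target and a pipe.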

siddhantac · 2025-02-24T02:13:01Z

If monopoly can differentiate between single-file input vs directory then the first option is possible. The flag would be considered invalid if the input is a directory.
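A minimal sketch of that check, assuming the input is available as a path and the custom filename arrives as an optional value. `resolve_output_path` is a hypothetical helper for illustration, not an existing monopoly function.

```python
from pathlib import Path
from typing import Optional


def resolve_output_path(input_path: Path, output: Optional[Path]) -> Optional[Path]:
    """Return the user-chosen output path, or None to keep the generated filename.

    Raises ValueError when a custom filename is combined with a directory input.
    """
    if output is None:
        # No custom name given: the caller falls back to the generated filename
        return None
    if input_path.is_dir():
        raise ValueError("a custom output filename cannot be used with a directory input")
    return output
```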

For option 2, the challenge will be differentiating between multiple files, since my main purpose was to pipe the output into other tools.

Perhaps a new column could be added with the filename. However, that means every row would have the same value for that column.

Example:

date,description,amount,filename
2025-02-24,Food,$23,2025_01_dbs.csv
2025-02-13,Rent,$1000,2025_01_dbs.csv
2025-02-17,Gym,$150,2025_01_dbs.csv
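With several input files, the extra column is what lets a downstream tool tell the rows apart. A sketch of producing such output, where `dump_with_filename` and the row layout are assumptions made for this example:

```python
import csv
from typing import Dict, List, TextIO, Tuple

# (date, description, amount) kept as plain strings for this sketch
Row = Tuple[str, str, str]


def dump_with_filename(statements: Dict[str, List[Row]], stream: TextIO) -> None:
    """Write rows from several statements to one CSV stream, tagging each row with its source."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["date", "description", "amount", "filename"])
    for filename, rows in statements.items():
        for date, description, amount in rows:
            # The filename repeats on every row that came from the same statement
            writer.writerow([date, description, amount, filename])
```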

benjamin-awd · 2025-02-25T15:50:03Z

After some thought, I think it might actually be easier to run the library directly (or create some kind of wrapper over it) to give the level of control you're looking for. Would something like this work?

import csv
from pathlib import Path
from monopoly.banks import BankDetector, banks
from monopoly.pdf import PdfDocument
from monopoly.pipeline import PdfParser, Pipeline

def generate_parser(file_path: str):
    """Generates a parser using the input file path."""
    document = PdfDocument(file_path).unlock_document()
    analyzer = BankDetector(document)
    bank = analyzer.detect_bank(banks)
    parser = PdfParser(bank, document)
    return parser

def main():
    # alternatively, you could also use glob here to grab multiple statements at once
    input_file = "statements/dbs/dbs-2024-10.pdf"
    output_directory = Path("output")
    output_directory.mkdir(parents=True, exist_ok=True)
    output_path = output_directory / "my-statement.csv"
    
    parser = generate_parser(input_file)
    pipeline = Pipeline(parser)
    statement = pipeline.extract()
    transactions = pipeline.transform(statement)
    
    print(f"Writing CSV to file path: {output_path}")
    
    with open(output_path, mode="w", encoding="utf8", newline='') as file:
        writer = csv.writer(file)
        
        # Write header
        writer.writerow(statement.columns)
        
        for transaction in transactions:
            writer.writerow([transaction.date, transaction.description, transaction.amount])

if __name__ == "__main__":
    main()

benjamin-awd · 2025-02-25T16:00:15Z

> Perhaps a new column could be added with the filename. However, that means every row would have the same value for that column

Unfortunately, CSV is poorly suited to handling metadata. I think a better solution here would be to support some kind of JSON output with a metadata field, which would then be relatively easy to read and parse with something like jq.
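To make the JSON idea concrete, here is one possible shape, sketched with the standard json module. The `metadata` and `transactions` keys and the `to_json` helper are assumptions for illustration; nothing like this exists in monopoly yet.

```python
import json
from typing import Any, Dict, List


def to_json(filename: str, transactions: List[Dict[str, Any]]) -> str:
    """Serialize transactions with a metadata block that is stored once, not per row."""
    payload = {
        "metadata": {"filename": filename},  # hypothetical field names
        "transactions": transactions,
    }
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    print(to_json(
        "2025_01_dbs.csv",
        [{"date": "2025-02-24", "description": "Food", "amount": -23.0}],
    ))
```

With output in this shape, a filter such as `jq '.metadata.filename'` would extract the source file, and `jq '.transactions[]'` would stream the individual rows.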

siddhantac changed the title ~~Custom output filename~~ Feature request: Custom output filename Feb 9, 2025
