Feature - Added new unit_test_writer by gavindeeppahl · Pull Request #104 · ONSdigital/rdsa-utils

gavindeeppahl · 2024-07-30T15:32:23Z

Description

This PR introduces a standalone function that generates basic unit test code from a simple config dict, using CSV files as inputs. Running it produces a new .py file containing the test script.

Improvements available on request:

I can add extensive comments to the file if required.
Currently the script outputs a generic test function at the end of the .py file; this can be improved if I have a template of what's preferable.
The column_type override is currently limited to 'string' and 'float'; this can be extended to other types.
A different output directory can be added for the generated .py file; currently it is written to the same folder as the input files.
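
On extending the column_type override: one possible shape, sketched here with the standard library only, is a lookup from type name to a casting function. `TYPE_CASTERS` and `apply_overrides` are hypothetical names for illustration and are not taken from this PR; only the override layout (a type name mapped to a list of column names) follows the example config used later in this thread.

```python
from typing import Callable, Dict, List

# Hypothetical lookup: override type name -> function that casts a raw CSV string.
# "int" and "bool" are illustrative additions, not types the PR supports today.
TYPE_CASTERS: Dict[str, Callable[[str], object]] = {
    "string": str,
    "float": float,
    "int": int,
    "bool": lambda raw: raw.strip().lower() in {"true", "1", "yes"},
}


def apply_overrides(
    rows: List[Dict[str, str]],
    column_type_override: Dict[str, List[str]],
) -> List[Dict[str, object]]:
    """Return copies of rows with the overridden columns cast to the requested type."""
    # Invert {"float": ["602"]} into {"602": float} for per-column lookup.
    casters: Dict[str, Callable[[str], object]] = {}
    for type_name, columns in column_type_override.items():
        if type_name not in TYPE_CASTERS:
            raise ValueError(f"Unsupported column type: {type_name!r}")
        for column in columns:
            casters[column] = TYPE_CASTERS[type_name]

    # Columns without an override keep their raw string value.
    return [
        {col: casters[col](val) if col in casters else val for col, val in row.items()}
        for row in rows
    ]
```

With this layout, supporting another type would mean adding one entry to the lookup rather than another branch in the processing code.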

Peer review

Any new code includes all the following:

Documentation: docstrings and comments have been added/updated.
Style guidelines: New code conforms to the project's contribution guidelines.
Functionality: The code works as expected, handles expected edge cases and exceptions are handled appropriately.
Complexity: The code is not overly complex, logic has been split into appropriately sized functions, etc.
Test coverage: Unit tests cover essential functions for a reasonable range of inputs and conditions. Added and existing tests pass on my machine.

Review comments

Suggestions should be tailored to the code that you are reviewing. Provide context.
Be critical and clear, but not mean. Ask questions and set actions.

These might include:

bugs that need fixing (does it work as expected? and does it work with other code that it is likely to interact with?)
alternative methods (could it be written more efficiently or with more clarity?)
documentation improvements (does the documentation reflect how the code actually works?)
additional tests that should be implemented (do the tests effectively assure that it works correctly? Are there additional edge cases/negative tests to be considered?)
code style improvements (could the code be written more clearly?)

Further reading: code review best practices

dombean · 2024-09-02T09:38:40Z

@gavindeeppahl & @AnneONS:

Here are a couple of my thoughts:

Refactor main() function to accept parameters instead of hard-coding them. This makes the function more flexible and reusable.
Use argparse for command-line arguments: add argparse to handle command-line arguments, allowing users to specify their own parameters when running the script. I guess this is only necessary if you want to call it from the command line rather than from another script.
Remove the if __name__ == "__main__" block, so that users call the function with their own arguments from another script.

I'm assuming you'd want users to call it with their own arguments from another script? If so, you can ignore the argparse example.

import argparse
import json

# Config and process_dataframe are assumed to be defined in unit_test_writer.py.


def main(csv_path: str, files: list, function_name: str, column_type_override: dict) -> None:
    """Initialise configuration and process CSV files for unit testing.

    This function sets up the configuration with paths, filenames, and function names,
    and then calls `process_dataframe` to handle the CSV files and generate the test
    code.

    Parameters
    ----------
    csv_path : str
        The path to the directory containing the CSV files.
    files : list
        A list of filenames to process.
    function_name : str
        The name of the function to generate tests for.
    column_type_override : dict
        A dictionary to override column types.

    Returns
    -------
    None
    """
    config = Config(
        csv_path=csv_path,
        files=files,
        function_name=function_name,
        column_type_override=column_type_override,
    )

    process_dataframe(config)


def run_from_command_line() -> None:
    """Parse command-line arguments and pass them to `main`."""
    parser = argparse.ArgumentParser(description="Process CSV files for unit testing.")
    parser.add_argument("--csv_path", type=str, required=True, help="Path to the CSV files directory.")
    parser.add_argument("--files", nargs='+', required=True, help="List of CSV filenames.")
    parser.add_argument("--function_name", type=str, required=True, help="Name of the function to generate tests for.")
    parser.add_argument("--column_type_override", type=str, required=True, help="Column type overrides in JSON format.")

    args = parser.parse_args()

    # Convert column_type_override from JSON string to dictionary
    column_type_override = json.loads(args.column_type_override)

    main(args.csv_path, args.files, args.function_name, column_type_override)

# Example usage:
# if __name__ == "__main__":
#     run_from_command_line()

Usage Instructions

Option 1: Running from the Command Line

Users can run the script from the command line with their own parameters:

python -m rdsa_utils.helpers.unit_test_writer --csv_path "path/to/csv" --files "input1.csv" "expected_output.csv" "fail_output.csv" --function_name "new_function" --column_type_override '{"string": ["period", "reference"], "float": ["602"]}'
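
To show what the command above turns into, here is a small standalone sketch that rebuilds the same parser and feeds it the same arguments. `build_parser` is a hypothetical helper used only for this illustration, and `shlex.split` stands in for the tokenising that a POSIX shell would do.

```python
import argparse
import json
import shlex


def build_parser() -> argparse.ArgumentParser:
    """Recreate the argument definitions from the suggested run_from_command_line."""
    parser = argparse.ArgumentParser(description="Process CSV files for unit testing.")
    parser.add_argument("--csv_path", type=str, required=True)
    parser.add_argument("--files", nargs="+", required=True)
    parser.add_argument("--function_name", type=str, required=True)
    parser.add_argument("--column_type_override", type=str, required=True)
    return parser


# The arguments from the command above, as they would be typed in a POSIX shell.
command_line = (
    '--csv_path "path/to/csv" '
    '--files "input1.csv" "expected_output.csv" "fail_output.csv" '
    '--function_name "new_function" '
    """--column_type_override '{"string": ["period", "reference"], "float": ["602"]}'"""
)

# shlex.split applies POSIX quoting rules, so the single-quoted JSON stays one token.
args = build_parser().parse_args(shlex.split(command_line))

# nargs="+" collects the filenames into a list; the JSON string becomes a dict.
column_type_override = json.loads(args.column_type_override)
print(args.files)
print(column_type_override)
```

The single quotes around the JSON rely on POSIX shell quoting; the Windows cmd prompt does not treat single quotes as quote characters, so the quoting would likely need adjusting there.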

Option 2: Creating a Custom Script

Users can create their own Python script to call the main() function with custom arguments:

from rdsa_utils.helpers.unit_test_writer import main

main(
    csv_path="path/to/csv",
    files=["input1.csv", "expected_output.csv", "fail_output.csv"],
    function_name="new_function",
    column_type_override={"string": ["period", "reference"], "float": ["602"]}
)

This approach provides flexibility and allows users to customise the parameters as needed.

rdsa_utils/helpers/unit_test_writer.py

AnneONS

Some initial comments; I'm now going to continue my review inside VS Code :-)

rdsa_utils/helpers/unit_test_writer.py

AnneONS

A few more comments, mostly about how we identify the type from the CSV. Otherwise I'm happy this is ready to go once we've addressed Dom's comments. I spoke to him about how to run it.
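
On identifying types from a CSV: every cell is read as text, so any column type has to be inferred from the values. The sketch below is one standard-library approach and is not the logic in this PR; `infer_column_type` and `infer_types` are hypothetical names, and the two-way 'string'/'float' split simply mirrors the types the override currently supports.

```python
import csv
import io
from typing import Dict, List


def infer_column_type(values: List[str]) -> str:
    """Guess 'float' if every non-empty value parses as a float, else 'string'."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        # Nothing to go on, so fall back to the safest type.
        return "string"
    for value in non_empty:
        try:
            # Note: float() also accepts text such as "nan" and "inf".
            float(value)
        except ValueError:
            return "string"
    return "float"


def infer_types(csv_text: str) -> Dict[str, str]:
    """Infer a type name for each column of a CSV given as text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    # Short rows yield None for missing cells, so treat those as empty strings.
    return {
        col: infer_column_type([(row[col] or "") for row in rows])
        for col in rows[0]
    }


sample = "period,reference,602\n202401,A1,1.5\n202402,B2,\n"
print(infer_types(sample))
```

In this sample the digit-only `period` column is inferred as 'float', which is the kind of case a column type override can correct.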

rdsa_utils/helpers/unit_test_writer.py

Feature-Added new unit_test_writer

9f4c15d

gavindeeppahl added the enhancement (New feature or request) label Jul 30, 2024

gavindeeppahl requested a review from AnneONS July 30, 2024 15:32

gavindeeppahl self-assigned this Jul 30, 2024

gavindeeppahl requested review from diego-ons and dombean as code owners July 30, 2024 15:32

Gavindeep Pahl added 3 commits July 30, 2024 16:38

Updated Changelog

b083cec

Feature - Added a unit test writer

688c7f4

Added indents

65c1ce2

dombean reviewed Sep 2, 2024

AnneONS reviewed Sep 3, 2024

Add requested changes

961d1d4

gavindeeppahl wants to merge 5 commits into development from feature-unittest-writer

3 participants
