-
Notifications
You must be signed in to change notification settings - Fork 0
Merging google sheets logic #1 #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
see also issue #1 |
ctgraham
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions and some change requests.
make-batch-dirs
Outdated
| logger.addHandler(handler) | ||
| return logger | ||
|
|
||
| logger = logging # change as needed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a global?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"logger" should be global so it can be used throughout the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved the initiation of logger into the main function and made it a global variable so that it can be referenced in all functions.
make-batch-dirs
Outdated
| def create_batch_folder(scanning_path, batch_name): | ||
| batch_path = scanning_path+"/"+batch_name | ||
| logger.info(f"Creating Batch Path: {batch_path}") | ||
| print(f"Creating Batch Path: {batch_path}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why output to STDOUT and to the logger with duplicate content? If this is important, wrap these calls in a helper function to keep the messages DRY.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a stopgap solution I did for figuring out what to do with the print statements in general. Ideally, log statements should be sufficient, but this script might be used within a pipeline, where logs would be better, or used as a cli tool where print statements (or stdout in general) would be more readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I addressed this here. This was done by adding a console logger along with a file logger that both are called at the same time. Might be able to enhance this more but would take a bit more looking into.
make-batch-dirs
Outdated
| return batch_path | ||
|
|
||
| def copy_xslx_to_batch(batch_path): | ||
| if os.path.isfile(batch_path+"/manifest.xlsx"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Set the batch_path plus filename to a variable to keep this code DRY.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in Commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still very repetitive, but with notable (unintentional) distinctions:
if os.path.isfile(os.path.sep.join([batch_path,'manifest.xlsx'])):
logger.info(f"Copying spreadsheet to {batch_path}/manifest.xlsx")
shutil.copyfile(args.xls_file, os.path.sep.join(batch_path,'manifest.xlsx'))
else:
logger.error(f"Error: {batch_path}/manifest.xlsx does not exist.")
In the if conditional, the arguments are wrapped in an array; in the copyfile directive, the arguments are passed directly. In the logger calls, manual concatenation is done.
If the name manifest.xml ever changes, we have many, many edits to make.
Cleaner would be:
MANIFEST_FILE_PATH = os.path.sep.join([batch_path,'manifest.xlsx'])
if os.path.isfile(MANIFEST_FILE_PATH):
logger.info(f"Copying spreadsheet to {MANIFEST_FILE_PATH}")
shutil.copyfile(args.xls_file, MANIFEST_FILE_PATH)
else:
logger.error(f"Error: {MANIFEST_FILE_PATH} does not exist.")
Bonus, move 'manifest.xml' to a module level constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm working on cleaning this up a bit. I did miss a few of the os.path.sep.join statements which I've caught all of those up. Moving to the "Cleaner" version is on the TODO list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made the "Cleaner" version commit update here.
make-batch-dirs
Outdated
| df = sheet.read() | ||
| batch_path = create_batch_folder(scanning_path, batch_name) | ||
| # Make sure the df has an 'id' column and data rows | ||
| if ('id' in df.columns): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why check the existence of the id column and number of rows here, but not for the XLSX pathway? Could this be improved by generating the dataframe regardless of the source, and then calling the same logic to check the id, length, and make_dirs_from_df?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iirc dataframe is generated in both cases. The check for id column and other error checking can be put inside make_dirs_from_df and then there should be parity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error checking parity added 9880f0c
sheetutils.py
Outdated
| } | ||
|
|
||
| # Update the sheet with DataFrame contents | ||
| result = sheet.values().update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we happy with throwing all exceptions directly to the caller here? If so, this should be documented, and the caller should have a try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
handled in a49ae39
…ment, when before it was inherited from parent namespace
Remove specific file patterns from .gitignore.
Updated path handling to use os.path.sep for cross-platform compatibility.
Removed commented-out print statement and condition check related to debugging.
…sing into main function. Added a --log-file option.
Merging google sheets logic