Script 357: gsheet example

Purpose

This Python script is used to merge data from a primary data source and a reference data source, and then generate an output dataframe.

To Elaborate

The problem being solved by this Python script is to merge data from two different data sources, specifically a primary data source and a reference data source. The primary data source is represented by the variable inputDf, and the reference data source is represented by the variable gSheetsDf. The script merges these two data sources based on a common column, ‘Campaign’, and creates a new dataframe called temp_df. From this merged dataframe, the script selects specific columns (‘Account’, ‘Campaign’, and ‘B’) and renames one of the columns to ‘Campaign Priority’, creating the output dataframe outputDf. Finally, the script generates an email summary prompt that includes the total count by Campaign Priority for the dataset in outputDf.

Walking Through the Code

  1. The script starts by importing the necessary libraries and defining the variable today as the current date.
  2. The primary data source is assigned to the variable inputDf using the key “1” from the dataSourceDict dictionary.
  3. The reference data source is assigned to the variable gSheetsDf using the key “2_1” from the dataSourceDict dictionary. This represents the first sheet of a Google Sheets document.
  4. The script merges the inputDf and gSheetsDf dataframes based on the ‘Campaign’ column, using an inner join. The merged dataframe is assigned to the variable temp_df.
  5. From the temp_df dataframe, the script selects the columns ‘Account’, ‘Campaign’, and ‘B’ and renames the ‘B’ column to ‘Campaign Priority’. The resulting dataframe is assigned to the variable outputDf.
  6. The script prints the string “abc”.
  7. The script generates an email summary prompt using a formatted string, including the total count by Campaign Priority for the dataset in outputDf.

Vitals

  • Script ID : 357
  • Client ID / Customer ID: 297978546 / 2
  • Action Type: Email Report
  • Item Changed: None
  • Output Columns:
  • Linked Datasource: M1 Report
  • Reference Datasource: Google Sheets
  • Owner: Mingxia Wu (mwu@marinsoftware.com)
  • Created by Mingxia Wu on 2023-10-16 10:38
  • Last Updated by Mingxia Wu on 2024-01-24 02:52
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
##
## name: gsheet_example
## description:
##  This is an example for google sheet reference
## 
## author: undefined
## created: 2023-10-16
## 

today = datetime.datetime.now(CLIENT_TIMEZONE).date()

# primary data source and columns
inputDf = dataSourceDict["1"]
# reference data source and columns
gSheetsDf = dataSourceDict["2_1"]  # gsheets dataframe (first sheet)
# To access 10th row of Column C, use gSheetsDf.loc[10, 'C']

temp_df = pd.merge(inputDf, gSheetsDf, left_on=['Campaign'], right_on=['A'], how='inner')

outputDf=temp_df[['Account','Campaign','B']].rename(columns={'B': 'Campaign Priority'})
print("abc")

emailSummaryPrompt=f"""List the total count by Campaign Priority for below dataset
{outputDf.to_string()}
"""

Post generated on 2024-05-15 07:44:05 GMT

comments powered by Disqus