Script 1765: Offline Import Memira
Purpose:
The Python script merges and processes data from two sources to generate a structured dataset for advertising campaigns.
To Elaborate
The Python script is designed to merge and process data from two different sources, specifically a Google Sheets document and another data source, to create a structured dataset for advertising campaigns. The script aligns data based on specific columns, renames columns for clarity, and processes URL fields to ensure they contain necessary tracking parameters. If certain conditions are met, it sets a default URL. The final output is a cleaned and organized dataset that can be used for further analysis or reporting on advertising campaigns.
Walking Through the Code
- Data Import and Initial Setup
- The script begins by importing data from two sources: a Google Sheets document and another unspecified data source. These are stored in
gSheetsDf
andinputDf
respectively.
- The script begins by importing data from two sources: a Google Sheets document and another unspecified data source. These are stored in
- Data Merging
- The script merges these two dataframes on specific columns, ‘Ad’ and ‘Keyword’, to align related data from both sources.
- Column Renaming
- After merging, the script renames several columns to more descriptive names using predefined constants, which helps in maintaining clarity and consistency in the dataset.
- URL Processing
- The script checks for the presence of a specific tracking parameter (
utm_marin
) in two URL columns. Based on this check, it sets a final destination URL, defaulting to a specific URL if the parameter is not found.
- The script checks for the presence of a specific tracking parameter (
- Final Adjustments
- The script renames the column containing the final destination URL to a constant name for custom parameters and ensures no missing values in this column by replacing NaNs with empty strings.
- Output
- The processed dataframe is stored in
outputDf
and printed in a tabular format for review or further use.
- The processed dataframe is stored in
Vitals
- Script ID : 1765
- Client ID / Customer ID: 1306925581 / 60269545
- Action Type: Bulk Upload (Preview)
- Item Changed: Ad
- Output Columns: Account, Campaign, Group, Creative ID, Unique ID, Display URL, Media sub-type, Media Type
- Linked Datasource: M1 Report
- Reference Datasource: Google Sheets
- Owner: Jesus Garza (jgarza@marinsoftware.com)
- Created by Jesus Garza on 2025-02-26 03:28
- Last Updated by Jesus Garza on 2025-02-26 03:48
> See it in Action
Python Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
#Offline format script
# Define the column names as constants
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_CREATIVE_ID = 'Creative ID'
BULK_COL_MEDIA_SUBTYPE = 'Media sub-type'
BULK_COL_MEDIA_TYPE = 'Media Type'
BULK_COL_DISPLAY_URL = 'Display URL'
BULK_COL_CUSTOM_PARAMETERS = 'Custom Parameters'
BULK_COL_UNIQUE_ID = 'Unique ID'
inputDf = dataSourceDict["1"]
gSheetsDf = dataSourceDict["2_1"] # gsheets dataframe (first sheet)
# Merge dataframes based on 'Ad' and 'Keyword'
mergedDf = pd.merge(gSheetsDf, inputDf, left_on='F', right_on='Keyword', how='left')
mergedDf = mergedDf.rename(columns={
'A': BULK_COL_UNIQUE_ID,
'B': BULK_COL_CAMPAIGN,
'E': BULK_COL_GROUP,
'F': BULK_COL_CREATIVE_ID,
'D': BULK_COL_MEDIA_SUBTYPE,
'C': BULK_COL_MEDIA_TYPE,
'G': BULK_COL_DISPLAY_URL
})
# If you want to replace NaN values with an empty string in the URL columns
mergedDf['Destination URL'].fillna('', inplace=True)
mergedDf['Suggested URL'].fillna('', inplace=True)
DestURL_contains_utm_marin = mergedDf['Destination URL'].str.contains('utm_marin', case=False, na=False)
SuggestURL_contains_utm_marin = mergedDf['Suggested URL'].str.contains('utm_marin', case=False, na=False)
# Set FinalDestURL based on conditions
mergedDf['FinalDestURL'] = np.where(DestURL_contains_utm_marin, mergedDf['Destination URL'],
np.where(SuggestURL_contains_utm_marin, mergedDf['Suggested URL'], 'https://www.memira.nl/'))
# Rename the FinalDestURL column to BULK_COL_CUSTOM_PARAMETERS
mergedDf = mergedDf.rename(columns={'FinalDestURL': BULK_COL_CUSTOM_PARAMETERS})
# If you want to replace NaN values with an empty string in the BULK_COL_CUSTOM_PARAMETERS column
mergedDf[BULK_COL_CUSTOM_PARAMETERS].fillna('', inplace=True)
outputDf = mergedDf
print(tableize(outputDf))
Post generated on 2025-03-11 01:25:51 GMT