Script 415: Offline Import Memira
Purpose
The Python script processes and merges data from two sources to generate a structured dataset for advertising campaigns, ensuring URLs are correctly formatted and assigned.
To Elaborate
The script is designed to handle and process data related to advertising campaigns by merging information from two different data sources. It focuses on aligning and structuring the data to ensure that each campaign entry has the correct URL parameters. The script checks for specific URL tracking parameters and assigns a default URL if these parameters are not present. This ensures that all advertising entries have a valid destination URL, which is crucial for tracking and analytics purposes. The script also renames columns to match a predefined structure, facilitating further processing or analysis.
Walking Through the Code
- Data Loading and Merging
- The script begins by loading data from two sources into dataframes,
inputDf
andgSheetsDf
. - It merges these dataframes on specific columns (‘Ad’ and ‘Keyword’), ensuring that related data from both sources are combined into a single dataframe,
mergedDf
.
- The script begins by loading data from two sources into dataframes,
- Column Renaming and Data Cleaning
- The merged dataframe’s columns are renamed to match predefined constants, ensuring consistency in column naming.
- The script fills any missing values in the ‘Destination URL’ and ‘Suggested URL’ columns with empty strings to prevent errors in subsequent operations.
- URL Processing and Assignment
- The script checks if the ‘Destination URL’ or ‘Suggested URL’ contains the ‘utm_marin’ parameter.
- Based on this check, it assigns a ‘FinalDestURL’ to each entry, defaulting to ‘https://www.memira.se/’ if neither URL contains the parameter.
- The ‘FinalDestURL’ column is then renamed to match a predefined constant for custom parameters.
- Final Output Preparation
- The script ensures that any missing values in the ‘BULK_COL_CUSTOM_PARAMETERS’ column are replaced with empty strings.
- The processed dataframe,
outputDf
, is prepared for output, and its contents are printed in a tabular format.
Vitals
- Script ID : 415
- Client ID / Customer ID: 1306925575 / 60269545
- Action Type: Bulk Upload
- Item Changed: Ad
- Output Columns: Account, Campaign, Group, Creative ID, Media sub-type, Media Type, Unique ID, Custom Parameters, Display URL
- Linked Datasource: M1 Report
- Reference Datasource: Google Sheets
- Owner: Tom McCaughey (tmccaughey@marinsoftware.com)
- Created by Tom McCaughey on 2023-10-25 15:50
- Last Updated by Tom McCaughey on 2024-01-10 14:24
> See it in Action
Python Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
#Offline format script
# Define the column names as constants
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_CREATIVE_ID = 'Creative ID'
BULK_COL_MEDIA_SUBTYPE = 'Media sub-type'
BULK_COL_MEDIA_TYPE = 'Media Type'
BULK_COL_DISPLAY_URL = 'Display URL'
BULK_COL_CUSTOM_PARAMETERS = 'Custom Parameters'
BULK_COL_UNIQUE_ID = 'Unique ID'
inputDf = dataSourceDict["1"]
gSheetsDf = dataSourceDict["2_1"] # gsheets dataframe (first sheet)
# Merge dataframes based on 'Ad' and 'Keyword'
mergedDf = pd.merge(gSheetsDf, inputDf, left_on='F', right_on='Keyword', how='left')
mergedDf = mergedDf.rename(columns={
'A': BULK_COL_UNIQUE_ID,
'B': BULK_COL_CAMPAIGN,
'E': BULK_COL_GROUP,
'F': BULK_COL_CREATIVE_ID,
'D': BULK_COL_MEDIA_SUBTYPE,
'C': BULK_COL_MEDIA_TYPE,
'G': BULK_COL_DISPLAY_URL
})
# If you want to replace NaN values with an empty string in the URL columns
mergedDf['Destination URL'].fillna('', inplace=True)
mergedDf['Suggested URL'].fillna('', inplace=True)
DestURL_contains_utm_marin = mergedDf['Destination URL'].str.contains('utm_marin', case=False, na=False)
SuggestURL_contains_utm_marin = mergedDf['Suggested URL'].str.contains('utm_marin', case=False, na=False)
# Set FinalDestURL based on conditions
mergedDf['FinalDestURL'] = np.where(DestURL_contains_utm_marin, mergedDf['Destination URL'],
np.where(SuggestURL_contains_utm_marin, mergedDf['Suggested URL'], 'https://www.memira.se/'))
# Rename the FinalDestURL column to BULK_COL_CUSTOM_PARAMETERS
mergedDf = mergedDf.rename(columns={'FinalDestURL': BULK_COL_CUSTOM_PARAMETERS})
# If you want to replace NaN values with an empty string in the BULK_COL_CUSTOM_PARAMETERS column
mergedDf[BULK_COL_CUSTOM_PARAMETERS].fillna('', inplace=True)
outputDf = mergedDf
print(tableize(outputDf))
Post generated on 2024-11-27 06:58:46 GMT