Script 415: Offline Import Memira

Purpose:

The Python script merges and processes data from two sources to generate a structured dataset for advertising campaigns.

To Elaborate

The Python script is designed to merge and process data from two different sources, specifically focusing on advertising campaign information. It takes data from a Google Sheets document and another data source, merging them based on specific columns related to advertisements and keywords. The script then renames certain columns to align with predefined constants, ensuring consistency in the dataset. It also handles missing values in URL columns by replacing them with empty strings and checks for specific URL parameters to determine the final destination URL. This processed data is then prepared for output, providing a structured dataset that can be used for further analysis or reporting on advertising campaigns.

Walking Through the Code

Data Input and Initialization
- The script begins by loading data from two sources into dataframes: inputDf from a generic data source and gSheetsDf from a Google Sheets document.
- These dataframes are merged based on the ‘Ad’ and ‘Keyword’ columns, using a left join to ensure all entries from the Google Sheets data are retained.
Column Renaming
- After merging, the script renames several columns to match predefined constants, ensuring that the dataset has a consistent structure and naming convention.
Handling Missing Values
- The script replaces any NaN values in the ‘Destination URL’ and ‘Suggested URL’ columns with empty strings to prevent errors in subsequent processing.
URL Processing
- It checks if the ‘Destination URL’ or ‘Suggested URL’ contains the parameter ‘utm_marin’. Based on this check, it sets the ‘FinalDestURL’ to either the ‘Destination URL’, ‘Suggested URL’, or a default URL (‘https://www.memira.se/’).
Final Adjustments
- The ‘FinalDestURL’ column is renamed to match the constant BULK_COL_CUSTOM_PARAMETERS.
- Any remaining NaN values in this column are replaced with empty strings to ensure data integrity.
Output Preparation
- The processed dataframe, outputDf, is prepared for output, and the results are printed in a tabular format using the tableize function.

Vitals

Script ID : 415
Client ID / Customer ID: 1306925575 / 60269545
Action Type: Bulk Upload
Item Changed: Ad
Output Columns: Account, Campaign, Group, Creative ID, Media sub-type, Media Type, Unique ID, Custom Parameters, Display URL
Linked Datasource: M1 Report
Reference Datasource: Google Sheets
Owner: Tom McCaughey (tmccaughey@marinsoftware.com)
Created by Tom McCaughey on 2023-10-25 15:50
Last Updated by Tom McCaughey on 2024-01-10 14:24

> See it in Action

Python Code

#Offline format script
# Define the column names as constants
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_CREATIVE_ID = 'Creative ID'
BULK_COL_MEDIA_SUBTYPE = 'Media sub-type'
BULK_COL_MEDIA_TYPE = 'Media Type'
BULK_COL_DISPLAY_URL = 'Display URL'
BULK_COL_CUSTOM_PARAMETERS = 'Custom Parameters'
BULK_COL_UNIQUE_ID = 'Unique ID'

inputDf = dataSourceDict["1"]
gSheetsDf = dataSourceDict["2_1"]  # gsheets dataframe (first sheet)

# Merge dataframes based on 'Ad' and 'Keyword'
mergedDf = pd.merge(gSheetsDf, inputDf, left_on='F', right_on='Keyword', how='left')

mergedDf = mergedDf.rename(columns={
    'A': BULK_COL_UNIQUE_ID,
    'B': BULK_COL_CAMPAIGN,
    'E': BULK_COL_GROUP,
    'F': BULK_COL_CREATIVE_ID,
    'D': BULK_COL_MEDIA_SUBTYPE,
    'C': BULK_COL_MEDIA_TYPE,
    'G': BULK_COL_DISPLAY_URL
})

# If you want to replace NaN values with an empty string in the URL columns
mergedDf['Destination URL'].fillna('', inplace=True)
mergedDf['Suggested URL'].fillna('', inplace=True)

DestURL_contains_utm_marin = mergedDf['Destination URL'].str.contains('utm_marin', case=False, na=False)
SuggestURL_contains_utm_marin = mergedDf['Suggested URL'].str.contains('utm_marin', case=False, na=False)

# Set FinalDestURL based on conditions
mergedDf['FinalDestURL'] = np.where(DestURL_contains_utm_marin, mergedDf['Destination URL'],
                                    np.where(SuggestURL_contains_utm_marin, mergedDf['Suggested URL'], 'https://www.memira.se/'))

# Rename the FinalDestURL column to BULK_COL_CUSTOM_PARAMETERS
mergedDf = mergedDf.rename(columns={'FinalDestURL': BULK_COL_CUSTOM_PARAMETERS})

# If you want to replace NaN values with an empty string in the BULK_COL_CUSTOM_PARAMETERS column
mergedDf[BULK_COL_CUSTOM_PARAMETERS].fillna('', inplace=True)

outputDf = mergedDf

print(tableize(outputDf))

Post generated on 2025-03-11 01:25:51 GMT

25 Oct 2023

« Script 409: New Launch Updates Round 2 Script 417: Campaign Benchmark Performance »

MarinOne Scripts Creator's Corner