Script 509: Memira NO Offline Import

Purpose:

The Python script processes and merges data from two sources to generate a structured dataset for offline import, ensuring URLs are correctly assigned based on specific conditions.

To Elaborate

The Python script is designed to handle data integration and transformation tasks for offline import processes. It takes data from two sources, merges them based on specific columns, and then performs a series of transformations to ensure the data is in the correct format for further use. A key aspect of the script is its handling of URL data, where it checks for specific parameters within URLs and assigns a final URL based on these conditions. This ensures that the data is consistent and meets the requirements for structured budget allocation (SBA) processes. The script also includes steps to handle missing data by replacing NaN values with empty strings, ensuring that the final dataset is clean and ready for import.

Walking Through the Code

Data Import and Initial Setup
- The script begins by importing data from two sources into dataframes: inputDf and gSheetsDf.
- These dataframes are merged based on the ‘Ad’ and ‘Keyword’ columns, using a left join to ensure all entries from gSheetsDf are retained.
Data Transformation
- The merged dataframe is renamed to align with predefined column names, ensuring consistency in the dataset.
- The script checks for NaN values in the ‘Destination URL’ and ‘Suggested URL’ columns, replacing them with empty strings to prevent errors in subsequent operations.
URL Handling
- The script identifies URLs containing the ‘utm_marin’ parameter in both ‘Destination URL’ and ‘Suggested URL’ columns.
- Based on the presence of this parameter, it assigns a ‘FinalDestURL’, defaulting to a specific URL if neither contains the parameter.
Final Adjustments and Output
- The ‘FinalDestURL’ column is renamed to ‘Custom Parameters’ to match the expected output format.
- Any remaining NaN values in the ‘Custom Parameters’ column are replaced with empty strings.
- The final dataframe, outputDf, is prepared for output, ensuring it is ready for the next steps in the offline import process.

Vitals

Script ID : 509
Client ID / Customer ID: 1306925577 / 60269545
Action Type: Bulk Upload
Item Changed: Ad
Output Columns: Account, Campaign, Group, Creative ID, Media sub-type, Media Type, Display URL, Custom Parameters, Unique ID
Linked Datasource: M1 Report
Reference Datasource: Google Sheets
Owner: Tom McCaughey (tmccaughey@marinsoftware.com)
Created by Tom McCaughey on 2023-11-07 16:14
Last Updated by Tom McCaughey on 2024-05-03 08:29

> See it in Action

Python Code

#Offline format script
# Define the column names as constants
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_CREATIVE_ID = 'Creative ID'
BULK_COL_MEDIA_SUBTYPE = 'Media sub-type'
BULK_COL_MEDIA_TYPE = 'Media Type'
BULK_COL_DISPLAY_URL = 'Display URL'
BULK_COL_CUSTOM_PARAMETERS = 'Custom Parameters'
BULK_COL_UNIQUE_ID = 'Unique ID'

inputDf = dataSourceDict["1"]
gSheetsDf = dataSourceDict["2_1"]  # gsheets dataframe (first sheet)

# Merge dataframes based on 'Ad' and 'Keyword'
mergedDf = pd.merge(gSheetsDf, inputDf, left_on='F', right_on='Keyword', how='left')

mergedDf = mergedDf.rename(columns={
    'A': BULK_COL_UNIQUE_ID,
    'B': BULK_COL_CAMPAIGN,
    'E': BULK_COL_GROUP,
    'F': BULK_COL_CREATIVE_ID,
    'D': BULK_COL_MEDIA_SUBTYPE,
    'C': BULK_COL_MEDIA_TYPE,
    'G': BULK_COL_DISPLAY_URL
})

# If you want to replace NaN values with an empty string in the URL columns
mergedDf['Destination URL'].fillna('', inplace=True)
mergedDf['Suggested URL'].fillna('', inplace=True)

DestURL_contains_utm_marin = mergedDf['Destination URL'].str.contains('utm_marin', case=False, na=False)
SuggestURL_contains_utm_marin = mergedDf['Suggested URL'].str.contains('utm_marin', case=False, na=False)

# Set FinalDestURL based on conditions
mergedDf['FinalDestURL'] = np.where(DestURL_contains_utm_marin, mergedDf['Destination URL'],
                                    np.where(SuggestURL_contains_utm_marin, mergedDf['Suggested URL'], 'https://www.memira.se/'))

# Rename the FinalDestURL column to BULK_COL_CUSTOM_PARAMETERS
mergedDf = mergedDf.rename(columns={'FinalDestURL': BULK_COL_CUSTOM_PARAMETERS})

# If you want to replace NaN values with an empty string in the BULK_COL_CUSTOM_PARAMETERS column
mergedDf[BULK_COL_CUSTOM_PARAMETERS].fillna('', inplace=True)

outputDf = mergedDf

print(tableize(outputDf))

Post generated on 2025-03-11 01:25:51 GMT

07 Nov 2023

« Script 507: Memira No Email import Script 511: Auto Pause & Re enable »

MarinOne Scripts Creator's Corner