Script 507: Memira No Email import

Purpose:

The Python script merges data from two sources, processes URLs, and prepares a structured output for further use.

To Elaborate

The Python script is designed to merge data from two different sources, specifically a Google Sheets dataframe and another input dataframe, based on specific columns. It then processes the merged data to ensure that URLs are correctly formatted and contain necessary tracking parameters. The script checks for the presence of ‘utm_marin’ in the URLs and sets a final destination URL based on certain conditions. The processed data is then prepared for output, ensuring that any missing values in key columns are handled appropriately. This script is particularly useful for managing and preparing marketing data for campaigns, ensuring that all necessary tracking information is included in the URLs.

Walking Through the Code

  1. Data Input and Merging
    • The script begins by loading data from two sources: a Google Sheets dataframe (gSheetsDf) and another input dataframe (inputDf).
    • It merges these dataframes on specific columns (‘Ad’ and ‘Keyword’) using a left join, which ensures that all entries from the Google Sheets dataframe are retained.
  2. Column Renaming
    • After merging, the script renames several columns to more descriptive names using predefined constants. This step is crucial for maintaining consistency and clarity in the data structure.
  3. URL Processing
    • The script checks if the ‘Destination URL’ and ‘Suggested URL’ columns contain the string ‘utm_marin’. This is important for tracking purposes in marketing campaigns.
    • Based on the presence of ‘utm_marin’, it sets a ‘FinalDestURL’ which is then renamed to ‘Custom Parameters’.
  4. Handling Missing Values
    • The script replaces any NaN values in the ‘Destination URL’, ‘Suggested URL’, and ‘Custom Parameters’ columns with empty strings to ensure data integrity.
  5. Output Preparation
    • Finally, the processed dataframe is prepared for output, and the results are printed in a tabular format using the tableize function.

Vitals

  • Script ID : 507
  • Client ID / Customer ID: 1306925577 / 60269545
  • Action Type: Bulk Upload
  • Item Changed: Ad
  • Output Columns: Account, Campaign, Group, Creative ID, Display URL, Media sub-type, Media Type, Status, Unique ID, Custom Parameters
  • Linked Datasource: M1 Report
  • Reference Datasource: Google Sheets
  • Owner: Tom McCaughey (tmccaughey@marinsoftware.com)
  • Created by Tom McCaughey on 2023-11-07 13:39
  • Last Updated by Tom McCaughey on 2024-01-10 14:23
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
#Email format script# Define the column names as constants
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_CREATIVE_ID = 'Creative ID'
BULK_COL_MEDIA_SUBTYPE = 'Media sub-type'
BULK_COL_MEDIA_TYPE = 'Media Type'
BULK_COL_DISPLAY_URL = 'Display URL'
BULK_COL_CUSTOM_PARAMETERS = 'Custom Parameters'
BULK_COL_UNIQUE_ID = 'Unique ID'
BULK_COL_STATUS = 'Status'

inputDf = dataSourceDict["1"]
gSheetsDf = dataSourceDict["2_1"]  # gsheets dataframe (first sheet)

# Merge dataframes based on 'Ad' and 'Keyword'
mergedDf = pd.merge(gSheetsDf, inputDf, left_on='F', right_on='Keyword', how='left')

mergedDf = mergedDf.rename(columns={
    'A': BULK_COL_UNIQUE_ID,
    'B': BULK_COL_CAMPAIGN,
    'E': BULK_COL_GROUP,
    'F': BULK_COL_CREATIVE_ID,
    'D': BULK_COL_MEDIA_SUBTYPE,
    'C': BULK_COL_MEDIA_TYPE,
    'G': BULK_COL_DISPLAY_URL,
    'H': BULK_COL_STATUS
})

# If you want to replace NaN values with an empty string in the URL columns
mergedDf['Destination URL'].fillna('', inplace=True)
mergedDf['Suggested URL'].fillna('', inplace=True)

DestURL_contains_utm_marin = mergedDf['Destination URL'].str.contains('utm_marin', case=False, na=False)
SuggestURL_contains_utm_marin = mergedDf['Suggested URL'].str.contains('utm_marin', case=False, na=False)

# Set FinalDestURL based on conditions
mergedDf['FinalDestURL'] = np.where(DestURL_contains_utm_marin, mergedDf['Destination URL'],
                                    np.where(SuggestURL_contains_utm_marin, mergedDf['Suggested URL'], 'https://www.memira.no/'))

# Rename the FinalDestURL column to BULK_COL_CUSTOM_PARAMETERS
mergedDf = mergedDf.rename(columns={'FinalDestURL': BULK_COL_CUSTOM_PARAMETERS})

# If you want to replace NaN values with an empty string in the BULK_COL_CUSTOM_PARAMETERS column
mergedDf[BULK_COL_CUSTOM_PARAMETERS].fillna('', inplace=True)

outputDf = mergedDf

print(tableize(outputDf))

Post generated on 2025-03-11 01:25:51 GMT

comments powered by Disqus