Script 1103: Script Group Country Code

Purpose

The script extracts the country code from a group name by identifying the segment before the first hyphen and assigns it to a ‘Country Code’ dimension.

To Elaborate

The Python script parses each group name and extracts the country code, defined as the segment appearing before the first hyphen (‘-’). The extracted code is assigned to a new dimension labeled ‘Country Code’. Group names starting with “BEFR” or “BENL” are handled as special cases, with the country code explicitly set to “BE”. For all other names, the segment is accepted only if it is exactly two characters long after trimming whitespace; it is not checked against a list of known country codes. Otherwise the script assigns “N/A” to indicate that no country code was found. The script processes every group name in the input data and strips leading and trailing whitespace from the group names before outputting the results.
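The rule described above can be illustrated with a minimal standalone sketch. It mirrors the logic of the script's function but is not the script itself: the function name country_code and all sample group names are invented for illustration.

```python
import re

def country_code(group_name):
    """Return the two-character prefix before the first hyphen, or "N/A"."""
    # Special cases: names starting with BEFR or BENL map to "BE"
    if group_name.startswith(("BEFR", "BENL")):
        return "BE"
    # Capture everything before the first hyphen
    match = re.search(r"^([^-]+)", group_name)
    if match:
        prefix = match.group(1).strip()
        if len(prefix) == 2:
            return prefix
    return "N/A"

# Hypothetical group names, invented for illustration
samples = ["FR-Brand-Exact", "BEFR-Brand", "USA-Generic", "NoHyphenHere", "de - Shoes"]
results = {name: country_code(name) for name in samples}
for name, code in results.items():
    print(f"{name!r} -> {code}")
```

Two details are worth noticing. A name with no hyphen is matched in full, so a bare two-character name would be returned as a code. The case of the prefix is also preserved, so “de - Shoes” yields “de” rather than “DE”.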

Walking Through the Code

  1. Configurable Parameters:
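    • The script begins by defining a configurable parameter PLACEMENT_KEY, set to the hyphen (‘-’), the character that marks where the country code ends. Note that the extraction pattern inside the function hard-codes the hyphen and does not reference PLACEMENT_KEY, so changing this value alone has no effect.
    • The primary data source, inputDf, is read from the dictionary dataSourceDict under the key “1”. Constants for the report and bulk column names (Account, Campaign, Group, Country Code) are defined alongside it.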
  2. Function Definition:
    • A function get_country_code_from_group_name is defined to extract the country code from a given group name.
    • The function checks for special cases where the group name starts with “BEFR” or “BENL”, returning “BE” for these cases.
    • For other group names, a regular expression is used to find the segment before the first hyphen. If this segment is a two-character string, it is returned as the country code; otherwise, “N/A” is returned.
  3. Data Processing:
    • The script copies all rows from the input DataFrame inputDf to a new DataFrame outputDf.
    • It applies the get_country_code_from_group_name function to each group name in the input data to populate the ‘Country Code’ column in the output DataFrame.
    • Leading and trailing whitespace is stripped from the group names in the output DataFrame.
  4. Output:
    • Finally, the script prints a tableized version of the output DataFrame, displaying the processed data with the extracted country codes.
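The data-processing steps above can be sketched end to end on a small table. This is an illustration under stated assumptions, not the production script: input_df is a hypothetical stand-in for the platform-supplied dataSourceDict entry, the rows are invented, and pandas' to_string replaces the platform's tableize helper, which is not available outside that environment.

```python
import re
import pandas as pd

# Hypothetical stand-in for dataSourceDict["1"]; rows invented for illustration
input_df = pd.DataFrame({
    "Account": ["Acme", "Acme", "Acme"],
    "Campaign": ["Brand", "Brand", "Generic"],
    "Group": [" FR-Brand ", "BENL-Brand", "Generic-Shoes"],
})

def get_country_code(group_name):
    # BEFR/BENL prefixes are mapped to "BE"
    if group_name.startswith(("BEFR", "BENL")):
        return "BE"
    # Otherwise take the segment before the first hyphen
    match = re.search(r"^([^-]+)", group_name)
    prefix = match.group(1).strip() if match else ""
    return prefix if len(prefix) == 2 else "N/A"

# Step 3: copy the rows, derive the new column, then tidy the group names
output_df = input_df.copy()
output_df["Country Code"] = input_df["Group"].apply(get_country_code)
output_df["Group"] = output_df["Group"].str.strip()

# Step 4: display the result (to_string used here in place of tableize)
print(output_df.to_string(index=False))
```

Because the work is done on a copy, the input table keeps its original, unstripped group names.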

Vitals

  • Script ID : 1103
  • Client ID / Customer ID: 1306927809 / 60270355
  • Action Type: Bulk Upload
  • Item Changed: AdGroup
  • Output Columns: Account, Campaign, Group, Country Code
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
  • Created by Grégory Pantaine on 2024-05-15 14:16
  • Last Updated by Grégory Pantaine on 2024-05-15 14:37

Python Code

## name: Script - Group - Country Code
## description:
## Parse Group Name and pick out the country code into a dimension 'Country Code'.
## Country code appears before the first '-' in the group name.
## 
## Copied by Grégory Pantaine
## created: 2024-05-15

# The function below uses re.search, so the module must be imported
import re

########### Configurable Params - START ##########
PLACEMENT_KEY = '-'

# Primary data source and columns
inputDf = dataSourceDict["1"]

# Output columns and initial values

RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_GROUP = 'Group'
RPT_COL_COUNTRY_CODE = 'Country Code'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_COUNTRY_CODE = 'Country Code'

# Function to extract country code from group name
def get_country_code_from_group_name(group_name):
    # Trim whitespace first so a leading space cannot hide the BEFR/BENL prefix
    group_name = group_name.strip()

    # Special cases for BEFR and BENL
    if group_name.startswith(("BEFR", "BENL")):
        return "BE"
    
    # Regular expression pattern to match the country code before the first '-'
    regex_pattern = r"^([^-]+)"

    # Search for the country code using the pattern
    match = re.search(regex_pattern, group_name)
    if match:
        country_code = match.group(1).strip()
        if len(country_code) == 2:
            return country_code  # Return the matched country code
        else:
            return "N/A"
    else:
        return "N/A"  # Return "N/A" if no match is found

# Copy all input rows to output
outputDf = inputDf.copy()

# Extract country code from each group name
outputDf[BULK_COL_COUNTRY_CODE] = inputDf[RPT_COL_GROUP].apply(get_country_code_from_group_name)

# Remove extra whitespace from group names
outputDf[RPT_COL_GROUP] = outputDf[RPT_COL_GROUP].str.strip()

# Print the tableized version of the output DataFrame
print(tableize(outputDf))
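In the listing above, PLACEMENT_KEY is defined but the regular expression hard-codes the hyphen. One possible way to make the separator genuinely configurable is sketched below. The helper name prefix_before and the single-character restriction are assumptions made for this example and are not part of the original script.

```python
import re

PLACEMENT_KEY = "-"  # separator that ends the country-code segment

def prefix_before(group_name, separator=PLACEMENT_KEY):
    """Return the trimmed text before the first separator, or None."""
    # A negated character class only works for a single character
    if len(separator) != 1:
        raise ValueError("separator must be a single character")
    # re.escape keeps characters such as '|' or '^' literal in the pattern
    pattern = r"^([^" + re.escape(separator) + r"]+)"
    match = re.search(pattern, group_name)
    return match.group(1).strip() if match else None

print(prefix_before("FR-Brand"))       # separator taken from PLACEMENT_KEY
print(prefix_before("FR|Brand", "|"))  # separator overridden per call
```

The two-character check and the “N/A” fallback would then be applied to the returned prefix exactly as in the original function.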

Post generated on 2024-11-27 06:58:46 GMT
