Script 155: brand nonbrand
Purpose
The Python script categorizes marketing campaigns as either “Brand” or “Non-Brand” based on specific naming patterns.
To Elaborate
The script processes marketing campaign data and labels each campaign “Brand” or “Non-Brand” according to its name. A campaign is labeled “Brand” when its name contains the substring “brand” (case-insensitive) and does not match a regular expression for “non-brand” and its variants (“nonbrand”, “non brand”, “non-brand”); every other campaign is labeled “Non-Brand.” Only campaigns whose new label differs from the current value of the Brand vs NonBrand column are written to the output, so the bulk upload contains just the rows that need to change. The classification helps separate brand-focused from non-brand-focused campaigns when organizing and analyzing marketing activity.
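As a rough illustration of the rule described above, the following sketch applies the same two checks to individual campaign names using only the standard library. The helper name classify_campaign and the sample names are hypothetical and do not appear in the script; the pattern mirrors the one in the code listing.

```python
import re

# Same pattern as the script: "non", an optional space or hyphen, then "brand".
NON_BRAND_RE = re.compile(r'non[\s-]?brand', re.IGNORECASE)


def classify_campaign(name):
    """Return 'Brand' or 'Non-Brand' for one campaign name (hypothetical helper)."""
    has_brand = 'brand' in name.lower()
    has_non_brand = NON_BRAND_RE.search(name) is not None
    return 'Brand' if has_brand and not has_non_brand else 'Non-Brand'


# Hypothetical campaign names, chosen to exercise each branch of the rule.
samples = [
    'US - Brand - Exact',
    'US - NonBrand - Broad',
    'UK non-brand',
    'DE Non Brand',
    'Generic Shoes',
]
for campaign_name in samples:
    print(campaign_name, '->', classify_campaign(campaign_name))
```

Because the first check is a plain substring match, a separator the pattern does not cover changes the result: a name such as Non_Brand (with an underscore) contains “brand” but does not match the non-brand pattern, so this rule would label it “Brand.”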
Walking Through the Code
- UUID Generation:
- The script defines a function string_to_uuid that converts an input string into a UUID using SHA-1 hashing. This is done using Python’s uuid library with a predefined namespace.
- Temporary Field Setup:
- A temporary field is created in the input DataFrame to store the new classification. This field is initially set to NaN to ensure it is blank before processing.
- Regular Expression Compilation:
- A regular expression pattern is compiled to identify variations of the term “non-brand” in a case-insensitive manner. This pattern is used to differentiate between “Brand” and “Non-Brand” campaigns.
- Campaign Classification:
- The script classifies campaigns by checking if the campaign name contains the word “brand” but not the “non-brand” pattern. Campaigns meeting these criteria are labeled as “Brand,” while others are labeled as “Non-Brand.”
- Data Update and Filtering:
- The new classification is copied to the output DataFrame. The script filters the output to include only those campaigns where the classification has changed, ensuring that only relevant updates are reflected in the final output.
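The steps above can be sketched end to end on a toy DataFrame. This is a minimal sketch under stated assumptions, not the script itself: input_df, output_df and the sample rows are hypothetical stand-ins for the platform-supplied inputDf and outputDf, the pattern is passed as a string with case=False rather than as a compiled regex, and numpy.where is used in place of the NaN-initialized temporary column followed by two .loc assignments.

```python
import numpy as np
import pandas as pd

COL = 'Brand vs NonBrand'
TMP_FIELD = COL + '_new'

# Hypothetical report rows; the real data comes from the linked datasource.
input_df = pd.DataFrame({
    'Account': ['A', 'A', 'A', 'A'],
    'Campaign': ['Brand - Exact', 'Non-Brand - Broad', 'Generic Shoes', 'Brand - Phrase'],
    COL: ['Brand', 'Brand', None, 'Non-Brand'],
})
output_df = input_df[['Account', 'Campaign', COL]].copy()

# Two boolean masks over the campaign names; na=False treats missing names as no match.
has_brand = input_df['Campaign'].str.contains('brand', case=False, na=False)
has_non_brand = input_df['Campaign'].str.contains(r'non[\s-]?brand', case=False, na=False)

# "Brand" only when the name mentions brand and is not a non-brand variant.
input_df[TMP_FIELD] = np.where(has_brand & ~has_non_brand, 'Brand', 'Non-Brand')

# Copy the new label to the output, then keep only rows whose label changed.
output_df[COL] = input_df[TMP_FIELD]
changed = output_df[input_df[TMP_FIELD].notnull() & (input_df[COL] != input_df[TMP_FIELD])]
print(changed)
```

In this toy data the first row already carries the label the rule assigns, so it is dropped from the result; the other three rows are kept because their current value is either different or missing.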
Vitals
- Script ID : 155
- Client ID / Customer ID: 1306922573 / 2
- Action Type: Bulk Upload
- Item Changed: Campaign
- Output Columns: Account, Campaign, Brand vs NonBrand
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Jonathan Reichl (jreichl@marinsoftware.com)
- Created by Jonathan Reichl on 2023-05-31 10:13
- Last Updated by Michael Huang on 2024-01-12 03:50
Python Code
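# Imports used below; the hosting script environment may already provide these
import datetime
import re
import numpy
RPT_COL_CAMPAIGN = 'Campaign'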
RPT_COL_ACCOUNT = 'Account'
RPT_COL_BRAND_VSNONBRAND = 'Brand vs NonBrand'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_BRAND_VSNONBRAND = 'Brand vs NonBrand'
#outputDf[BULK_COL_BRAND_VSNONBRAND] = "<<YOUR VALUE>>"
def string_to_uuid(input_string):
    # Derive a deterministic, name-based (version 5) UUID from the input string
    import uuid
    # Use a predefined namespace UUID, for example the DNS namespace
    namespace_uuid = uuid.NAMESPACE_DNS
    # uuid5 hashes the namespace and the string with SHA-1
    return str(uuid.uuid5(namespace_uuid, input_string))
# Example call with a placeholder string
input_str = "your_input_string_here"
resulting_uuid = string_to_uuid(input_str)
print(resulting_uuid)
TMP_FIELD = BULK_COL_BRAND_VSNONBRAND + '_new'
# blank out tmp field
inputDf[TMP_FIELD] = numpy.nan
#set up regex for non brand
pattern = r'non[\s-]?brand'
regex = re.compile(pattern, re.IGNORECASE)
today = datetime.datetime.now(CLIENT_TIMEZONE).date()
print(tableize(inputDf))
# Build each mask once; na=False treats missing campaign names as non-matching
has_brand = inputDf[RPT_COL_CAMPAIGN].str.contains('brand', case=False, na=False)
has_non_brand = inputDf[RPT_COL_CAMPAIGN].str.contains(regex, na=False)
inputDf.loc[has_brand & ~has_non_brand, TMP_FIELD] = 'Brand'
inputDf.loc[~has_brand | has_non_brand, TMP_FIELD] = 'Non-Brand'
print(tableize(inputDf))
print(inputDf.index.duplicated())
# copy new strategy to output
outputDf.loc[:,BULK_COL_BRAND_VSNONBRAND] = inputDf.loc[:, TMP_FIELD]
# only include campaigns with changed strategy in bulk file
outputDf = outputDf[ inputDf[TMP_FIELD].notnull() & (inputDf[BULK_COL_BRAND_VSNONBRAND] != inputDf[TMP_FIELD]) ]
Post generated on 2024-11-27 06:58:46 GMT