Blog 10: Regular Expressions in Python

Regular Expressions in Python for Hackers

Regular Expressions in Python – Pattern Matching & Data Extraction

Regular expressions (regex) allow hackers and developers to search, match, and manipulate strings based on specific patterns. Python’s built-in re module is a powerful tool that can help in CTFs, log analysis, web scraping, and custom data filters.

Getting Started with the re Module

import re

# Basic match
pattern = r"hacker"
text = "python for hackers"
match = re.search(pattern, text)
if match:
    print("Match found!")

Common Regex Functions

  • re.search() – scans the whole string and returns the first match (or None)
  • re.findall() – returns all non-overlapping matches as a list
  • re.match() – matches only at the beginning of the string
  • re.sub() – replaces every match with a replacement string
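
The difference between these four functions is easiest to see side by side. Here is a minimal sketch; the sample string and variable names are made up for illustration.

```python
import re

# Sample string, assumed for illustration
text = "admin logged in; admin logged out"

# re.search() scans the whole string and returns the first match object (or None)
first = re.search(r"logged \w+", text)

# re.match() only succeeds if the pattern matches at position 0
at_start = re.match(r"admin", text)
not_at_start = re.match(r"logged", text)  # None: "logged" is not at the start

# re.findall() returns every non-overlapping match as a list of strings
all_users = re.findall(r"admin", text)

# re.sub() returns a new string with each match replaced
renamed = re.sub(r"admin", "user", text)

print(first.group())     # logged in
print(at_start.group())  # admin
print(not_at_start)      # None
print(all_users)         # ['admin', 'admin']
print(renamed)           # user logged in; user logged out
```

Note that re.search() and re.match() return a match object rather than a string, so check for None before calling .group() on real input.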

Extracting All Matches

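text = "Email me at vaibhav@example.com or test123@hackmail.com"
# Loose pattern: word characters, dots, or hyphens on both sides of "@".
# Good enough for quick extraction, not for strict email validation.
pattern = r"[\w.-]+@[\w.-]+"
emails = re.findall(pattern, text)
print(emails)  # ['vaibhav@example.com', 'test123@hackmail.com']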

Replacing Sensitive Data

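log = "User: root, Password: 123456"
# \S+ matches any run of non-whitespace, so passwords containing
# symbols (which \w+ would only partly cover) are fully masked.
pattern = r"Password: \S+"
clean_log = re.sub(pattern, "Password: ****", log)
print(clean_log)  # User: root, Password: ****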
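
When several fields need masking, a capture group and a backreference let you keep the field name and replace only the value. This is a rough sketch: the key=value log format, the field names, and the redact() helper are assumptions for illustration, not a standard layout.

```python
import re

# Sample log line in a hypothetical key=value format
log = "user=root password=hunter2 token=abc123 action=login"

# Group 1 captures the field name so it can be kept; \S+ matches the value
SENSITIVE = re.compile(r"\b(password|token)=\S+", re.IGNORECASE)

def redact(line):
    # \1 re-inserts the captured field name in front of the mask
    return SENSITIVE.sub(r"\1=****", line)

print(redact(log))  # user=root password=**** token=**** action=login
```

Compiling the pattern once with re.compile() is a reasonable choice when the same redaction runs over many lines.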

Regex Meta-Characters

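  • . – any character except a newline
  • \d – digit
  • \w – word character (a-z, A-Z, 0-9, _, plus Unicode letters and digits unless re.ASCII is set)
  • \s – whitespace (space, tab, newline)
  • + – one or more of the preceding item
  • * – zero or more of the preceding item
  • ? – zero or one of the preceding item (optional)
  • [] – character set
  • () – capture group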
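
The meta-characters above are easier to remember once you have seen each one match something. The following sketch uses made-up sample strings and variable names.

```python
import re

# . matches any single character except a newline
dot = re.findall(r"h.t", "hat hot h\nt")

# \d+ matches one or more digits
digits = re.findall(r"\d+", "port 22 and port 8080")

# \s+ matches a run of whitespace, handy for splitting
parts = re.split(r"\s+", "a  b\tc")

# ? makes the preceding character optional
schemes = re.findall(r"https?", "http and https")

# * allows zero or more of the preceding character
stars = re.findall(r"ab*", "a ab abbb")

# [] matches any one character from the set
vowels = re.findall(r"[aeiou]", "regex")

# () captures part of the match; findall() then returns only the group
ids = re.findall(r"id=(\w+)", "id=42 id=admin")

print(dot)      # ['hat', 'hot']
print(digits)   # ['22', '8080']
print(parts)    # ['a', 'b', 'c']
print(schemes)  # ['http', 'https']
print(stars)    # ['a', 'ab', 'abbb']
print(vowels)   # ['e', 'e']
print(ids)      # ['42', 'admin']
```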

Capture Groups Example

text = "IP: 192.168.1.1"
pattern = r"IP: (\d+\.\d+\.\d+\.\d+)"
match = re.search(pattern, text)
if match:
    print("Captured IP:", match.group(1))
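
A pattern can hold more than one group, and Python also supports named groups with the (?P<name>...) syntax, which makes the extraction code easier to read. Here is a small sketch; the log line and the group names ip and port are assumptions for illustration.

```python
import re

# Sample log line, assumed for illustration
line = "Failed login from 10.0.0.5 port 2222"

# Two named groups: one for the address, one for the port
pattern = r"from (?P<ip>\d+\.\d+\.\d+\.\d+) port (?P<port>\d+)"
match = re.search(pattern, line)
if match:
    print(match.group(0))      # whole match: from 10.0.0.5 port 2222
    print(match.group("ip"))   # 10.0.0.5
    print(match.group(2))      # named groups are still numbered: 2222
    print(match.groupdict())   # {'ip': '10.0.0.5', 'port': '2222'}
```

Captured values are always strings, so convert the port with int() if you need to compare it numerically.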

Use Cases for Hackers

  • Log parsing for credentials/IPs
  • Extracting tokens or secrets from web content
  • Brute-force automation for credential leaks
  • Scraping specific patterns (emails, phone numbers, IPs)
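
As one example of the log-parsing use case, the sketch below counts how often each IPv4 address appears in a few log lines. The log lines and the extract_ips() helper are hypothetical. The regex only finds candidates that look like an address ({1,3} means one to three repetitions), and the standard ipaddress module then rejects invalid ones such as 999.1.1.1.

```python
import re
import ipaddress
from collections import Counter

# Hypothetical auth-log lines, made up for illustration
log_lines = [
    "Failed password for root from 203.0.113.7 port 4242",
    "Failed password for admin from 203.0.113.7 port 4243",
    "Accepted password for alice from 198.51.100.23 port 5000",
    "Bogus entry from 999.1.1.1",
]

# Four dot-separated runs of 1 to 3 digits; (?:...) groups without capturing
IP_CANDIDATE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

def extract_ips(lines):
    """Count valid IPv4 addresses found in the given lines."""
    counts = Counter()
    for line in lines:
        for candidate in IP_CANDIDATE.findall(line):
            try:
                ipaddress.IPv4Address(candidate)  # raises ValueError if invalid
            except ValueError:
                continue
            counts[candidate] += 1
    return counts

ip_counts = extract_ips(log_lines)
print(ip_counts.most_common())  # [('203.0.113.7', 2), ('198.51.100.23', 1)]
```

Splitting the work this way keeps the regex simple: the pattern handles the shape of the text, and a dedicated parser handles the range checks.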

Conclusion

Regex is an essential skill in any hacker’s toolbox. Mastering the re module helps you automate the detection, extraction, and redaction of data, so your scripts can do more with less code. Keep experimenting and building!
