What is it about?
A common way to identify specific types of values in a text corpus is to match them with regular expressions (regexes). However, this normally requires hand-writing a regex for each type. Instead, we take a large corpus of regular expressions from a regex playground and use them as features for a machine learning algorithm, which automatically identifies the regexes that are useful for classifying types.
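As a rough illustration of the idea, the sketch below turns each regex in a small corpus into a binary feature and then ranks the regexes by how well each one separates a target type from the rest. It is not the method from the paper: the five-pattern REGEX_CORPUS, the labeled SAMPLES, the helper names featurize and rank_regexes, and the use of F1 as the ranking score are all assumptions made for this example.

```python
import re

# Hypothetical stand-in for a large, uncurated regex corpus.
REGEX_CORPUS = [
    r"\d{4}-\d{2}-\d{2}",
    r"[\w.+-]+@[\w-]+\.[\w.]+",
    r"\d+",
    r".*",
    r"[A-Z]{2}\d{3}",
]

# Hypothetical labeled values; a real data set would be far larger.
SAMPLES = [
    ("2023-06-18", "date"),
    ("1999-12-31", "date"),
    ("alice@example.org", "email"),
    ("bob.smith@mail.example.com", "email"),
    ("42", "integer"),
    ("2023", "integer"),
]


def featurize(value, patterns):
    """Return one binary feature per regex: 1 if it matches the whole value."""
    return [1 if re.fullmatch(p, value) else 0 for p in patterns]


def rank_regexes(samples, target_type, patterns):
    """Rank regexes by F1 score for separating target_type from other types."""
    rows = [(featurize(v, patterns), label == target_type) for v, label in samples]
    ranked = []
    for i, pattern in enumerate(patterns):
        tp = sum(1 for f, pos in rows if f[i] and pos)
        fp = sum(1 for f, pos in rows if f[i] and not pos)
        fn = sum(1 for f, pos in rows if not f[i] and pos)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        ranked.append((f1, pattern))
    # Highest-scoring regexes first; ties keep corpus order.
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


if __name__ == "__main__":
    for type_name in ("date", "email", "integer"):
        score, pattern = rank_regexes(SAMPLES, type_name, REGEX_CORPUS)[0]
        print(type_name, pattern, score)
```

In this toy setup, an overly general pattern such as `.*` matches every value and so scores poorly for any single type, which hints at why a learning step is useful for filtering an uncurated corpus. The binary feature vectors from featurize could equally be passed to an off-the-shelf classifier instead of being scored one regex at a time.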
Read the Original
This page is a summary of: Learning from Uncurated Regular Expressions for Semantic Type Classification, June 2023, ACM (Association for Computing Machinery), DOI: 10.1145/3596225.3596226.