The algorithm is rudimentary, with little to no error handling or enhancements. The code executes the following steps:
- Receives the input word. For now, no preprocessing is applied to the word. It looks the word up in a dictionary stored in S3 that converts the word to its pronunciation using the CMU dictionary.
- It checks DynamoDB to see if the word has been searched before. This is a computationally cheap way to avoid repeating calculations; DynamoDB effectively acts as a cache.
- If the word has not been searched before, a list of idioms is pulled from S3. These idioms were scraped from a few websites that together list a few thousand idioms. The output is limited by the quality of these idioms and by the subsequent step.
- Each idiom is converted to its pronunciation form, and the distance between the input word's pronunciation and that of each word in the idiom is calculated.
- The 10 shortest distances are output and the results are cached.
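
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the in-memory `PRONUNCIATIONS`, `IDIOMS`, and `CACHE` objects are hypothetical stand-ins for the S3 dictionary, the S3 idiom list, and the DynamoDB table, and the distance measure (Levenshtein distance over phoneme sequences) is an assumption, since the description does not say which distance is used.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins. In the real service the pronunciation dictionary and
# idiom list would be read from S3 and the cache would be a DynamoDB table.
PRONUNCIATIONS: Dict[str, List[str]] = {
    "cat": ["K", "AE", "T"], "hat": ["HH", "AE", "T"], "bat": ["B", "AE", "T"],
    "bag": ["B", "AE", "G"], "dog": ["D", "AO", "G"], "top": ["T", "AA", "P"],
    "drop": ["D", "R", "AA", "P"], "out": ["AW", "T"], "of": ["AH", "V"],
    "the": ["DH", "AH"], "at": ["AE", "T"], "a": ["AH"],
}
IDIOMS: List[str] = ["cat out of the bag", "at the drop of a hat", "top dog"]
CACHE: Dict[str, List[Tuple[int, str, str]]] = {}


def phoneme_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two phoneme sequences (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def find_matches(word: str, limit: int = 10) -> List[Tuple[int, str, str]]:
    """Return up to `limit` (distance, idiom, idiom_word) tuples, closest first."""
    # Step 1: convert the input word to its pronunciation.
    # No error handling: an unknown word raises KeyError.
    target = PRONUNCIATIONS[word]

    # Step 2: return the cached result if this word was searched before.
    if word in CACHE:
        return CACHE[word]

    # Steps 3-4: score every word of every idiom against the input word.
    scored = []
    for idiom in IDIOMS:
        for idiom_word in idiom.split():
            phones = PRONUNCIATIONS.get(idiom_word)
            if phones is None:
                continue  # skip idiom words missing from the dictionary
            scored.append((phoneme_distance(target, phones), idiom, idiom_word))

    # Step 5: keep the shortest distances and cache them.
    scored.sort(key=lambda item: item[0])
    results = scored[:limit]
    CACHE[word] = results
    return results
```

Calling `find_matches("bat")` scores each idiom word against the phonemes of "bat" and stores the result in `CACHE`, so a second call with the same word skips the distance calculations entirely.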