Variables

In mathematical expressions and equations, variables are used to represent unknown or changeable quantities.  They act as placeholders for undetermined or unspecified values.  Different unknowns are given different names.

• In algebra word problems, variables correspond to the number or amount of the "things" that are being discussed.
• In natural languages, the "things" that are discussed are represented as nouns and noun phrases...

Therefore, AutoMathic creates variables out of nouns and noun phrases.

Nouns and Noun Phrases

As AutoMathic reads its input, it identifies noun phrases and automatically selects variables to stand for them.  Noun phrases are whatever remains after numbers, operators, and function words have been identified and processed.

Nouns and simple noun phrases are almost always used as-is.  However, complex noun phrases that include restrictive clauses get simplified to focus on the restriction:

 Noun Phrase                        Result
 `cars that are red`                `red`
 `scores which would be counted`    `counted`
 `students who were absent`         `absent`
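
The simplification shown above can be sketched in Python.  This is only an illustration: the word lists, the pattern, and the function name `simplify_noun_phrase` are assumptions made for the example, and AutoMathic's actual grammar rules may differ.

```python
import re

# Hypothetical word lists; AutoMathic's real rules are not documented here.
_PRONOUNS = r"(?:that|which|who)"
_MODALS = r"(?:would|will|could|can|should|may|might)"
_BE = r"(?:is|are|was|were|be|been)"

# A relative pronoun, an optional modal, a form of "to be", then the restriction.
_RESTRICTIVE = re.compile(
    rf"\b{_PRONOUNS}\s+(?:{_MODALS}\s+)?{_BE}\s+(?P<restriction>.+)$",
    re.IGNORECASE,
)


def simplify_noun_phrase(phrase):
    """Return the restriction of a restrictive clause, or the phrase unchanged."""
    match = _RESTRICTIVE.search(phrase)
    if match:
        return match.group("restriction").strip()
    return phrase
```

Under these assumptions, the three phrases in the table reduce to `red`, `counted`, and `absent`, while a simple phrase such as `sample weight` is returned as-is.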

For readability, AutoMathic tries to select variables that are good mnemonics for their noun phrases.  It uses the following strategy to try to select an unused variable for a noun phrase:

1. Try the lowercase form of the first letter of the first word (e.g. "s" for "sample weight")
2. Try the uppercase form of the first letter of the first word (e.g. "S" for "sample weight")
3. Repeat steps 1-2 for the remaining letters of the first word (e.g. "a", "A", "m", "M", etc.)
4. Repeat steps 1-3 for the remaining words (e.g. "w", "W", "e", "E", etc.)
5. If all of those variable names are already used, find the first unused letter of the sequence "a-zA-Z".
• Since there are "only" 52 letters available for variable names, AutoMathic cannot handle problems that require more than 52 variables!  If it runs out of variable names, it will abort, saying that it "can't name" the new noun phrase.
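
The naming strategy above can be sketched in Python as follows.  The function name `pick_variable`, the `used` set, and the choice of exception are assumptions made for this example; only the search order and the "can't name" message come from the description above.

```python
import string


def pick_variable(noun_phrase, used):
    """Return an unused one-letter variable name for noun_phrase.

    `used` is the set of names already taken.  A ValueError is raised
    when all 52 letters are in use.
    """
    # Steps 1-4: visit the words in order and each word's letters in order,
    # trying the lowercase form and then the uppercase form of every letter.
    for word in noun_phrase.split():
        for ch in word:
            if ch not in string.ascii_letters:
                continue  # skip digits, punctuation, and non-ASCII characters
            for candidate in (ch.lower(), ch.upper()):
                if candidate not in used:
                    return candidate
    # Step 5: fall back to the first unused letter of a-z followed by A-Z.
    for candidate in string.ascii_letters:
        if candidate not in used:
            return candidate
    raise ValueError(f"can't name {noun_phrase!r}")
```

Starting with no names in use and adding each result to `used`, the phrases "sample weight", "sample size", and "speed" would receive "s", "S", and "p" in that order.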

Fuzzy-Matching

Before AutoMathic tries to select a new variable for a noun phrase, it must first determine if the noun phrase is actually referring to something new.  In informal language, the "same thing" can be referred to with different words or phrases.  The same variable should be used regardless of slight differences in wording, so it is important that AutoMathic recognize when different but similar noun phrases are really referring to the same thing.  It uses a few strategies to see if different noun phrases might refer to the same thing:

• Automatically matches singular and plural forms of words using the rules of the English language:
 Singular    Plural
 car         cars
 loss        losses
 church      churches
 ash         ashes
 box         boxes
 company     companies

Words with alternate or often-misspelled plural forms (e.g. potato, potatoes, "potatos") are usually recognized and matched using fuzzy matching.

• Uses a "fuzzy-matching" technique to determine if any two noun phrases are similar enough to refer to the same thing.
Fuzzy matching is always used on noun phrases entered by the user.  By default, it is also used against noun phrases in the mathematical facts library; that behavior can be turned on or off by setting the "b_fuzzy_facts" kernel option.
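
The singular/plural matching described in the first bullet can be approximated with a few suffix rules.  The Python sketch below is a rough heuristic written for illustration; the names `singularize` and `same_noun` are hypothetical, and AutoMathic's own rules are not shown in this document.

```python
def singularize(word):
    """Map a regular English plural to its singular form (rough heuristic)."""
    w = word.lower()
    if w.endswith("ies") and len(w) > 3:
        return w[:-3] + "y"      # companies -> company
    for suffix in ("sses", "ches", "shes", "xes"):
        if w.endswith(suffix):
            return w[:-2]        # losses -> loss, churches -> church
    if w.endswith("s") and not w.endswith("ss"):
        return w[:-1]            # cars -> car
    return w                     # already singular, or not a regular plural


def same_noun(a, b):
    """True when two words differ only by a regular plural ending."""
    return singularize(a) == singularize(b)
```

This sketch covers the regular pairs listed above.  It deliberately does not try to handle forms such as "potatoes" or "women", which the text leaves to fuzzy matching.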

AutoMathic's fuzzy-matching technique compares the noun phrase to previous noun phrases and scores their similarity on a scale of 0% to 100%.  Identical phrases score 100%, and phrases with no letters in common score 0%.  Any differences are reflected in a score somewhere in between.
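
The scoring method itself is not specified here.  As a stand-in with the same end points, the Python sketch below uses `difflib.SequenceMatcher` from the standard library; the function name `similarity` and the case-insensitive comparison are assumptions, and the scores AutoMathic actually produces may differ.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Score two noun phrases from 0.0 to 100.0 (stand-in metric)."""
    # ratio() is 1.0 for identical strings and 0.0 when the two strings
    # share no characters at all.
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()
```

With this stand-in, a near miss such as "potatoes" versus "potatos" lands strictly between the two extremes, which is the behavior the paragraph above describes.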

Predefined (but customizable) thresholds determine how a score is interpreted:

• Scores below a certain minimum (e.g. 67%) automatically fail to match.  The minimum threshold can be altered by setting the "f_fuzzy_min" kernel option.
• Scores above a certain maximum (e.g. 86%) are guaranteed candidates for a match, and the best match is used automatically.  The maximum threshold can be altered by setting the "f_fuzzy_max" kernel option.
• Scores between the minimum and maximum are possible matches:
1. AutoMathic checks its Frequently Asked Questions file (faq.dat) to see whether the user has answered this question before.  If so, the user's previous answer is reused automatically.  If not...
2. AutoMathic asks the user to decide among the possible matches, presenting the most likely ones first.  If the "b_learn_mode" kernel option is "1", it records the answers in faq.dat.
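
The decision procedure above can be sketched in Python for a single pair of phrases.  The function name `resolve_match`, the dictionary standing in for faq.dat, and the `ask_user` callback are all hypothetical.  The default thresholds are the example values quoted above, and a score exactly equal to a threshold is assumed to count as a possible match, since the text does not say how boundaries are handled.

```python
def resolve_match(score, phrase, candidate, faq, ask_user,
                  fuzzy_min=67.0, fuzzy_max=86.0, learn_mode=True):
    """Decide whether `phrase` and `candidate` refer to the same thing.

    faq        dict standing in for faq.dat: (phrase, candidate) -> bool
    ask_user   callable(phrase, candidate) -> bool, used only when needed
    learn_mode plays the role of the "b_learn_mode" kernel option
    """
    if score < fuzzy_min:
        return False                 # below the minimum: never a match
    if score > fuzzy_max:
        return True                  # above the maximum: accepted automatically
    key = (phrase, candidate)
    if key in faq:
        return faq[key]              # reuse the user's earlier answer
    answer = ask_user(phrase, candidate)
    if learn_mode:
        faq[key] = answer            # remember the answer for next time
    return answer
```

A fuller version would rank all candidates and pick the best one above the maximum; this sketch only shows how one score is interpreted.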

Irregular plurals (e.g. woman and women), minor misspellings, and minor typographical errors are usually similar enough to trigger some kind of a fuzzy match.

If different references to the same thing slip by and are not matched, AutoMathic can be explicitly told that they are the same with a statement of fact (e.g. "goose means geese").  This requires that the user pay close attention to AutoMathic's messages regarding variable creation.