Wordgames

Utilities

Enigmistics.clean_read - Function
clean_read(filename::AbstractString; newline_replace=" ")

Read a text file and return a string containing its contents.

The returned string is cleaned up to work better with the functions of the Enigmistics module. For example, runs of newlines, spaces, or tabs are reduced to a single space (newline information can be preserved by setting a different character in newline_replace).

Examples

julia> readlines("../texts/commedia.txt")[1:9] # simple read
9-element Vector{String}:
 "LA DIVINA COMMEDIA"
 "di Dante Alighieri"
 "INFERNO"
 ""
 ""
 ""
 "Inferno: Canto I"
 ""
 "  Nel mezzo del cammin di nostra vita"

julia> join(readlines(filename), " ")[1:107] # simple read + join
"LA DIVINA COMMEDIA di Dante Alighieri      INFERNO  Inferno: Canto I    Nel mezzo del cammin di nostra vita"

julia> clean_read(filename)[1:98] # clean read
"LA DIVINA COMMEDIA di Dante Alighieri INFERNO Inferno: Canto I Nel mezzo del cammin di nostra vita"

julia> clean_read(filename, newline_replace="/")[1:104] # clean read + newline replacement
"LA DIVINA COMMEDIA di Dante Alighieri / INFERNO / Inferno: Canto I / Nel mezzo del cammin di nostra vita"
Enigmistics.clean_text - Function
clean_text(s::AbstractString; newline_replace=" ")

Take a string and return a cleaned up version of it.

The cleaned string works better with the functions of the Enigmistics module. For example, runs of newlines, spaces, or tabs are reduced to a single space (newline information can be preserved by setting a different character in newline_replace).

This is the function on which clean_read relies.
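
The cleaning described above can be approximated with two regex replacements. This is only a minimal sketch: sketch_clean_text is a hypothetical name, and the actual implementation in Enigmistics may treat punctuation or edge cases differently.

```julia
# Hypothetical helper, not the package's actual implementation.
function sketch_clean_text(s::AbstractString; newline_replace=" ")
    # A run of whitespace containing at least one newline becomes the marker,
    # padded with spaces so it stays separate from the neighbouring words.
    s = replace(s, r"\s*\n\s*" => " $(newline_replace) ")
    # Every remaining run of whitespace collapses to a single space.
    s = replace(s, r"\s+" => " ")
    return String(strip(s))
end

sketch_clean_text("a  b\t\tc\n\n\nd")                       # "a b c d"
sketch_clean_text("a  b\t\tc\n\n\nd", newline_replace="/")  # "a b c / d"
```

Handling newlines first matters: once all whitespace is collapsed, a line break can no longer be told apart from an ordinary space.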

Enigmistics.count_letters - Function
count_letters(s::AbstractString)
count_letters(s::Vector{AbstractString})

Count the number of alphabetic characters in a string.

Examples

julia> s = "This sentence has thirty-one letters";

julia> count_letters(s) # 31
31

julia> length(s) # 36 = 31 letters + 4 spaces + 1 hyphen
36
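
For a single string, the same count can be obtained with Base functions alone. The name sketch_count_letters is hypothetical and the one-liner is only an illustration of the idea, not necessarily how the package does it.

```julia
# Hypothetical helper: count the characters for which isletter is true.
sketch_count_letters(s::AbstractString) = count(isletter, s)

s = "This sentence has thirty-one letters"
sketch_count_letters(s)   # 31
length(s)                 # 36
```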
Enigmistics.snip - Function
snip(text::String, interval::UnitRange{Int}, pad=10)

Extract a snippet of text from a given range, enlarged on both sides by the given padding.

Examples

julia> text = "abcdefghijklmnopqrstuvwxyz";

julia> snip(text,13:14,0)
"mn"

julia> snip(text,13:14,2)
"klmnop"
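
One way to obtain this behaviour is to widen the range by the padding and clamp it to the bounds of the text. The sketch below uses the hypothetical name sketch_snip and assumes ASCII text, where every index is a valid character boundary; it is not taken from the package source.

```julia
# Hypothetical helper, assuming ASCII text (byte index == character index).
function sketch_snip(text::String, interval::UnitRange{Int}, pad=10)
    lo = max(first(interval) - pad, firstindex(text))  # do not run past the start
    hi = min(last(interval) + pad, lastindex(text))    # do not run past the end
    return text[lo:hi]
end

text = "abcdefghijklmnopqrstuvwxyz"
sketch_snip(text, 13:14, 2)    # "klmnop"
sketch_snip(text, 13:14, 100)  # the whole alphabet, thanks to the clamping
```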

Anagrams

Enigmistics.are_anagrams - Function
are_anagrams(s1::AbstractString, s2::AbstractString; be_strict=true)
are_anagrams(s1::Vector{String}, s2::Vector{String}; be_strict=true)

Check if two strings are anagrams, i.e. if the letters of one can be rearranged to form the other.

The parameter be_strict defaults to true, so that two strings which merely contain the same words in a different order are not considered anagrams.

See also scan_for_anagrams.

Examples

julia> are_anagrams("The Morse Code","Here come dots!") # true anagram
true

julia> are_anagrams("The Morse Code","Morse: the code!") # just a fancy reordering
false

julia> are_anagrams("The Morse Code","Morse: the code!", be_strict=false) # just a fancy reordering, now considered an anagram
true
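
The distinction made by be_strict can be illustrated by comparing sorted letters first and sorted words second. All names below are hypothetical, and the strictness rule is an assumption inferred from the examples above rather than the package's actual logic.

```julia
# Hypothetical helpers: letters and words, lowercased and sorted.
sketch_letters(s) = sort([lowercase(c) for c in s if isletter(c)])
sketch_words(s) = sort([lowercase(filter(isletter, w)) for w in split(s)])

function sketch_are_anagrams(s1::AbstractString, s2::AbstractString; be_strict=true)
    # Anagrams must use exactly the same multiset of letters.
    sketch_letters(s1) == sketch_letters(s2) || return false
    # In strict mode, reject pairs that only reorder the same words.
    be_strict && sketch_words(s1) == sketch_words(s2) && return false
    return true
end

sketch_are_anagrams("The Morse Code", "Here come dots!")                   # true
sketch_are_anagrams("The Morse Code", "Morse: the code!")                  # false
sketch_are_anagrams("The Morse Code", "Morse: the code!", be_strict=false) # true
```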
Enigmistics.scan_for_anagrams - Function
scan_for_anagrams(text::String; 
    min_length_letters=6, max_length_letters=30, max_distance_words=40, 
    be_strict=true, print_results=false)

Scan a text and look for pairs of word sequences which are anagrams.

Return a vector of matches in the form (range1, words1, range2, words2).

Arguments

  • text: the input text to scan
  • min_length_letters=6: consider only sequences of words with at least this number of letters
  • max_length_letters=30: consider only sequences of words with at most this number of letters
  • max_distance_words=40: consider only sequences of words which are at most this many words apart
  • be_strict=true: do not consider as anagrams sequences where one is simply a reordering of the words of the other
  • print_results=false: whether to print results or just return them

See also are_anagrams.

Examples

julia> text = "Last night I saw a gentleman; he was a really elegant man.";

julia> matches = scan_for_anagrams(text, min_length_letters=1, max_length_letters=14, max_distance_words=10, be_strict=false)
4-element Vector{Any}:
 (14:16, SubString{String}["saw"], 34:36, SubString{String}["was"])
 (14:18, SubString{String}["saw", "a"], 34:38, SubString{String}["was", "a"])
 (18:18, SubString{String}["a"], 38:38, SubString{String}["a"])
 (18:28, SubString{String}["a", "gentleman"], 47:57, SubString{String}["elegant", "man"])

julia> matches = scan_for_anagrams(text, min_length_letters=1, max_length_letters=14, max_distance_words=10, be_strict=true)
3-element Vector{Any}:
 (14:16, SubString{String}["saw"], 34:36, SubString{String}["was"])
 (14:18, SubString{String}["saw", "a"], 34:38, SubString{String}["was", "a"])
 (18:28, SubString{String}["a", "gentleman"], 47:57, SubString{String}["elegant", "man"])

julia> matches = scan_for_anagrams(text, min_length_letters=5, max_length_letters=14, max_distance_words=10)
1-element Vector{Any}:
 (18:28, SubString{String}["a", "gentleman"], 47:57, SubString{String}["elegant", "man"])

Pangrams

Enigmistics.is_pangram - Function
is_pangram(s::AbstractString; language="en", verbose=false)

Check if a string is a pangram, i.e. if it contains every letter of the alphabet at least once.

The language parameter selects the alphabet to use (currently supported values are "en" for English and "it" for Italian). The English alphabet is the vector collect('a':'z'), while the Italian one is the same vector without the characters 'jkwxy'. If verbose is set to true and the string is not a pangram, the function also reports which letters are missing.

See also scan_for_pangrams.

Examples

julia> is_pangram("The quick ORANGE fox jumps over the lazy dog", verbose=true)
[ Info: Missing character(s): ['b', 'w']
false

julia> is_pangram("The quick BROWN fox jumps over the lazy dog")
true

julia> is_pangram("Pranzo d'acqua fa volti sghembi") # false due to the default alphabet being english
false

julia> is_pangram("Pranzo d'acqua fa volti sghembi", language="it")
true
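
The check amounts to finding which letters of the chosen alphabet never occur in the string. The sketch below uses hypothetical names and takes the alphabet as an explicit keyword argument instead of a language code; it is an illustration, not the package's implementation.

```julia
# Hypothetical helper: letters of the alphabet that never occur in s.
function sketch_missing_letters(s::AbstractString; alphabet=collect('a':'z'))
    present = Set(lowercase(s))          # set of characters found in s
    return [c for c in alphabet if !(c in present)]
end

# A pangram is a string with no missing letters.
sketch_is_pangram(s; alphabet=collect('a':'z')) =
    isempty(sketch_missing_letters(s; alphabet=alphabet))

sketch_missing_letters("The quick ORANGE fox jumps over the lazy dog")  # ['b', 'w']
sketch_is_pangram("The quick BROWN fox jumps over the lazy dog")        # true
```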
Enigmistics.scan_for_pangrams - Function
scan_for_pangrams(text::String; 
    max_length_letters=60, language="en", verbose=false, print_results=false)

Scan a text and look for sequences of words which are pangrams.

Return a vector of matches in the form (matching_range, matching_string).

Arguments

  • text: the input text to scan
  • max_length_letters=60: consider only sequences of words with at most this number of letters
  • language="en": language used to determine the alphabet
  • print_results=false: whether to print results or just return them

See also is_pangram.

Examples

julia> text = clean_read("../texts/paradise_lost.txt", newline_replace="/");

julia> scan_for_pangrams(text, max_length_letters=80, language="en")
1-element Vector{Any}:
 (21698:21804, "Grazed Ox, / JEHOVAH, who in one Night when he pass'd / From EGYPT marching, equal'd with one stroke / Both")

julia> text = clean_read("../texts/divina_commedia.txt", newline_replace="/"); 

julia> scan_for_pangrams(text, max_length_letters=50, language="it")
5-element Vector{Any}:
 (247196:247259, "aveste'. / E 'l buon maestro: "Questo cinghio sferza la colpa de")
 (247196:247262, "aveste'. / E 'l buon maestro: "Questo cinghio sferza la colpa de la")
 (247207:247270, "E 'l buon maestro: "Questo cinghio sferza la colpa de la invidia")
 (247210:247273, "l buon maestro: "Questo cinghio sferza la colpa de la invidia, e")
 (482359:482421, "probo. / Vidi la figlia di Latona incensa sanza quell'ombra che")

Palindromes

Enigmistics.is_palindrome - Function
is_palindrome(s::AbstractString)

Check if a string is a palindrome, i.e. if it reads the same backward as forward.

See also scan_for_palindromes.

Examples

julia> is_palindrome("Oozy rat in a sanitary zoo")
true

julia> is_palindrome("Alle carte t'alleni nella tetra cella")
true
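
Both examples contain spaces, punctuation, and mixed case, so the comparison presumably ignores everything except letters. A minimal sketch under that assumption, with the hypothetical name sketch_is_palindrome:

```julia
# Hypothetical helper: keep only letters, lowercase them, compare with the reverse.
function sketch_is_palindrome(s::AbstractString)
    letters = [lowercase(c) for c in s if isletter(c)]
    return letters == reverse(letters)
end

sketch_is_palindrome("Oozy rat in a sanitary zoo")  # true
sketch_is_palindrome("Not a palindrome")            # false
```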
Enigmistics.scan_for_palindromes - Function
scan_for_palindromes(text::String; 
    min_length_letters=6, max_length_letters=30, print_results=false)

Scan a text and look for sequences of words which are palindromes.

Return a vector of matches in the form (matching_range, matching_string).

Arguments

  • text: the input text to scan
  • min_length_letters=6: consider only sequences of words with at least this number of letters
  • max_length_letters=30: consider only sequences of words with at most this number of letters
  • print_results=false: whether to print results or just return them

See also is_palindrome.

Examples

julia> text = clean_read("../texts/ulyss.txt", newline_replace="/");

julia> scan_for_palindromes(text, min_length_letters=10)
5-element Vector{Any}:
 (266056:266070, "Madam, I'm Adam")
 (266077:266101, "Able was I ere I saw Elba")
 (266082:266096, "was I ere I saw")
 (1120093:1120105, "Hohohohohohoh")
 (1424774:1424785, "tattarrattat")
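
The (matching_range, matching_string) format can be illustrated with a brute-force scan over all runs of consecutive words. Everything in this sketch is hypothetical (sketch_scan, reads_same_both_ways, the sample text); unlike the output above, it keeps trailing punctuation in the matched string, and the package's actual search strategy may differ.

```julia
# Hypothetical predicate: letters only, case-insensitive palindrome test.
function reads_same_both_ways(s::AbstractString)
    letters = [lowercase(c) for c in s if isletter(c)]
    return letters == reverse(letters)
end

# Hypothetical brute-force scan over every run of consecutive words.
function sketch_scan(pred, text::String; min_length_letters=6, max_length_letters=30)
    word_ranges = findall(r"\S+", text)   # index range of each whitespace-delimited word
    matches = Tuple{UnitRange{Int},String}[]
    for i in eachindex(word_ranges)
        for j in i:lastindex(word_ranges)
            span = first(word_ranges[i]):last(word_ranges[j])
            candidate = text[span]
            n = count(isletter, candidate)
            n > max_length_letters && break   # longer runs starting at word i are too long as well
            if n >= min_length_letters && pred(candidate)
                push!(matches, (span, candidate))
            end
        end
    end
    return matches
end

text = "He said: Madam, I'm Adam. Then he left."
matches = sketch_scan(reads_same_both_ways, text)
# 1-element Vector: (10:25, "Madam, I'm Adam.")
```

The returned range indexes into the scanned text, so text[10:25] gives back the matched string.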

Heterograms

Enigmistics.is_heterogram - Function
is_heterogram(s::AbstractString)

Check if a string is a heterogram, i.e. if it does not contain any repeated letter.

See also scan_for_heterograms.

Examples

julia> is_heterogram("unpredictable") # letter 'e' is repeated
false

julia> is_heterogram("unpredictably")
true

julia> is_heterogram("The big dwarf only jumps")
true
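
The property reduces to a uniqueness test on the letters of the string. A minimal sketch, assuming case is ignored and non-letters are skipped (sketch_is_heterogram is a hypothetical name):

```julia
# Hypothetical helper: true when no letter occurs more than once.
function sketch_is_heterogram(s::AbstractString)
    letters = [lowercase(c) for c in s if isletter(c)]
    return allunique(letters)
end

sketch_is_heterogram("unpredictable")             # false, 'e' occurs twice
sketch_is_heterogram("The big dwarf only jumps")  # true
```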
Enigmistics.scan_for_heterograms - Function
scan_for_heterograms(text::String; 
    min_length_letters=10, print_results=false)

Scan a text and look for sequences of words which are heterograms.

Return a vector of matches in the form (matching_range, matching_string).

Arguments

  • text: the input text to scan
  • min_length_letters=10: consider only sequences of words with at least this number of letters
  • print_results=false: whether to print results or just return them

See also is_heterogram.

Examples

julia> text = clean_read("../texts/ulyss.txt", newline_replace="/");

julia> scan_for_heterograms(text, min_length_letters=15)
15-element Vector{Any}:
 (49809:49825, "and wheysour milk")
 (118616:118634, "a stocking: rumpled")
 (118618:118634, "stocking: rumpled")
 (332294:332312, "and forks? Might be")
 (424648:424664, "himself onward by")
 (478503:478519, "them quickly down")
 (498743:498759, "brought pad knife")
 (506800:506815, "rocky thumbnails")
 (518991:519009, "forms, a bulky with")
 (691836:691852, "with golden syrup")
 (707229:707245, "quickly and threw")
 (992565:992581, "THEM DOWN QUICKLY")
 (1349721:1349737, "and chrome tulips")
 (1442707:1442724, "myself go with and")
 (1442899:1442916, "with my legs round")

Lipograms

Enigmistics.is_lipogram - Function
is_lipogram(s::AbstractString, wrt::AbstractString)

Check if a string s is a lipogram with respect to the letters given by wrt, i.e. if s does not contain any letters from wrt.

The parameter wrt is a string so that single or multiple letters can be easily specified.

See also scan_for_lipograms.

Examples

julia> is_lipogram("If youth, throughout all history, had had a champion to stand up for it; to show 
       a doubting world that a child can think; and, possibly, do it practically; you wouldn’t constantly
       run across folks today who claim that “a child don’t know anything.” A child’s brain starts
       functioning at birth; and has, amongst its many infant convolutions, thousands of dormant
       atoms, into which God has put a mystic possibility for noticing an adult’s act, and figuring
       out its purport", "e", verbose=true)
true

julia> is_lipogram("The quick brown fox","abc",verbose=true) # found letters will be highlighted in bold magenta
[ Info: Letter(s) present from wrt: Set(['c', 'b'])
the quick brown fox
false
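
The test can be expressed as a set intersection between the letters of s and the letters of wrt. The names below are hypothetical and the sketch ignores case by assumption; it does not reproduce the highlighted output of verbose=true.

```julia
# Hypothetical helper: letters of wrt that do occur in s.
function sketch_forbidden_letters(s::AbstractString, wrt::AbstractString)
    return intersect(Set(lowercase(s)), Set(lowercase(wrt)))
end

# A lipogram contains none of the forbidden letters.
sketch_is_lipogram(s, wrt) = isempty(sketch_forbidden_letters(s, wrt))

sketch_forbidden_letters("The quick brown fox", "abc")  # a Set containing 'b' and 'c'
sketch_is_lipogram("A child can think", "e")            # true
```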
Enigmistics.scan_for_lipograms - Function
scan_for_lipograms(text::String, wrt::String; 
    min_length_letters=30, max_length_letters=100, print_results=false)

Scan a text and look for sequences of words that are lipograms with respect to the letters given by wrt.

Return a vector of matches in the form (matching_range, matching_string).

Arguments

  • text: the input text to scan
  • min_length_letters=30: consider only sequences of words with at least this number of letters
  • max_length_letters=100: consider only sequences of words with at most this number of letters
  • print_results=false: whether to print results or just return them

See also is_lipogram.

Examples

julia> text = clean_read("../texts/all_shakespeare.txt", newline_replace="/");

julia> scan_for_lipograms(text, "eta", min_length_letters=34) # E, T and A are the most common letters in English
3-element Vector{Any}:
 (1480855:1480902, "up your sword. / NYM. Will you shog off? I would")
 (3647302:3647352, "in music. [Sings.] "Willow, willow, willow." / Moor")
 (3966201:3966245, "LORDS. Good morrow, Richmond! / RICHMOND. Cry")

Tautograms

Enigmistics.is_tautogram - Function
is_tautogram(s::AbstractString)

Check if a given string is a tautogram, i.e. if all words in the string start with the same letter.

See also scan_for_tautograms.

Examples

julia> is_tautogram("Disney declared: 'Donald Duck definitely deserves devotion'")
true

julia> is_tautogram("She sells sea shells by the sea shore.") # "by" and "the" break the s-streak
false
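
A tautogram check only needs the first letter of each word. The sketch below is a hypothetical illustration: it splits on whitespace, skips leading punctuation such as the opening quote in 'Donald, and ignores tokens with no letters at all.

```julia
# Hypothetical helper: collect word initials and check that they all coincide.
function sketch_is_tautogram(s::AbstractString)
    initials = [lowercase(first(filter(isletter, w)))
                for w in split(s) if any(isletter, w)]
    return length(unique(initials)) <= 1
end

sketch_is_tautogram("Disney declared: 'Donald Duck definitely deserves devotion'")  # true
sketch_is_tautogram("She sells sea shells by the sea shore.")                       # false
```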
Enigmistics.scan_for_tautograms - Function
scan_for_tautograms(text::String; 
    min_length_words=5, max_length_words=20, print_results=false)

Scan a text and look for sequences of words which are tautograms.

Return a vector of matches in the form (matching_range, matching_string).

Arguments

  • text: the input text to scan
  • min_length_words=5: consider only sequences with at least this number of words
  • max_length_words=20: consider only sequences with at most this number of words
  • print_results=false: whether to print results or just return them

See also is_tautogram.

Examples

julia> text = clean_read("../texts/paradise_lost.txt", newline_replace="/");

julia> scan_for_tautograms(text, min_length_words=5, max_length_words=20)
6-element Vector{Any}:
 (20801:20830, "and ASCALON, / And ACCARON and")
 (110257:110281, "Topaz, to the Twelve that")
 (136170:136194, "to taste that Tree, / The")
 (320005:320030, "her Husbands hand her hand")
 (450274:450301, "Though to the Tyrant thereby")
 (456113:456141, "Through the twelve Tribes, to")

Abecedaries

Enigmistics.is_abecedary - Function
is_abecedary(s::AbstractString; language="en")

Check if a string is an abecedary, i.e. if it consists of a sequence of words whose initials are consecutive letters of the alphabet, in alphabetical order.

The language parameter selects the alphabet to use (currently supported values are "en" for English and "it" for Italian). The English alphabet is the vector collect('a':'z'), while the Italian one is the same vector without the characters 'jkwxy'.

See also scan_for_abecedaries.

Examples

julia> is_abecedary( # from Isidoro Bressan, "La Stampa", 18/10/1986
       "Amore baciami! Con dolci effusioni fammi gioire! Ho illibate labbra, 
       meraviglioso nido ove puoi quietare recondita sensualità traboccante. 
       Ubriachiamoci vicendevolmente, Zaira!", language="it")
true

julia> is_abecedary("A Bright Celestial Dawn Emerges")
true

julia> is_abecedary("A Bright Celestial Dawn Rises") # R breaks the alphabetic streak
false

julia> is_abecedary("Testing: u v w x y z a b c") # wraps around the alphabet by default
true
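
Judging from the examples, each initial must be the letter immediately following the previous one, with 'z' followed by 'a'. The sketch below encodes that reading with modular arithmetic; sketch_is_abecedary is a hypothetical name, and the alphabet is passed explicitly rather than through a language code.

```julia
# Hypothetical helper: initials must be consecutive in the alphabet, wrapping around.
function sketch_is_abecedary(s::AbstractString; alphabet=collect('a':'z'))
    initials = [lowercase(first(filter(isletter, w)))
                for w in split(s) if any(isletter, w)]
    idx = [findfirst(==(c), alphabet) for c in initials]
    any(isnothing, idx) && return false   # an initial outside the alphabet
    n = length(alphabet)
    # mod1 maps position n + 1 back to 1, which gives the wrap-around.
    return all(mod1(idx[i] + 1, n) == idx[i+1] for i in 1:length(idx)-1)
end

sketch_is_abecedary("A Bright Celestial Dawn Emerges")  # true
sketch_is_abecedary("A Bright Celestial Dawn Rises")    # false
sketch_is_abecedary("Testing: u v w x y z a b c")       # true
```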
Enigmistics.scan_for_abecedaries - Function
scan_for_abecedaries(text::String; 
    min_length_words=4, max_length_words=30, 
    language="en", print_results=false) 

Scan a text and look for sequences of words which are abecedaries.

Return a vector of matches in the form (matching_range, matching_string).

Arguments

  • text: the input text to scan
  • min_length_words=4: consider only sequences with at least this number of words
  • max_length_words=30: consider only sequences with at most this number of words
  • language="en": language used to determine the alphabet
  • print_results=false: whether to print results or just return them

See also is_abecedary.

Examples

julia> text = clean_read("../texts/paradise_lost.txt", newline_replace="/");

julia> scan_for_abecedaries(text, min_length_words=4, max_length_words=5, language="en")
3-element Vector{Any}:
 (102463:102490, "a boundless Continent / Dark")
 (368827:368846, "raging Sea / Tost up")
 (405485:405502, "and both confess'd")

julia> text = clean_read("../texts/divina_commedia.txt", newline_replace="/");

julia> scan_for_abecedaries(text, min_length_words=4, max_length_words=5, language="it")
7-element Vector{Any}:
 (42388:42410, "per questo regno. / Sol")
 (253993:254006, "che dire e far")
 (289587:289615, "crucifisso, dispettoso e fero")
 (461336:461357, "cera dedutta / e fosse")
 (468425:468450, "albor balenar Cristo. / Di")
 (504086:504116, "O predestinazion, quanto remota")
 (514226:514251, "con digiuno, / e Francesco")