The curriculum build specification

1. The document source layout

Our documentation sources are divided into pathways and lessons, which are in the subdirectories pathways and lessons respectively. A pathway is broadly a concatenation of a chosen ordered set of lessons (with some other pathway-specific resources thrown in).

In the following, we’ll use core as an example pathway, and simple-data-types as an example lesson.

1.1. Multiple natural languages

First, we may want to create the course core in multiple natural languages, e.g., English (en-us) or Spanish (es-mx). To accommodate this, the source dir pathways/core has a langs subdir, with a further subdir for each natlang that we want, i.e., en-us and es-mx. The source tree thus looks like:

pathways/
`—— core/
    `—— langs/
        |—— en-us/
        `—— es-mx/

We’ll call pathways/core/langs/en-us the core pathway’s root directory (for the en-us natlang).

1.2. Multiple proglangs

By default, the programming language associated with a pathway is Pyret.

We may also want to create the courses for our core pathway for multiple programming languages, e.g., pyret and wescheme. To accommodate this, we include the file proglang.txt in the pathway root directory (pathways/core/langs/en-us/), and list in it the various proglangs supported. Thus the contents of proglang.txt will look like:

pyret
wescheme

Sometimes we have a pathway that has associated with it only a single proglang, but that proglang is not Pyret. As an example, consider the pathway shipwrecks which is defined for the proglang spreadsheets. In this case, we still need a proglang.txt file containing the single line

spreadsheets
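
A minimal sketch of how a tool might read this file, assuming one proglang per line and a fallback to pyret when the file is absent. The function name read_proglangs, the handling of blank lines, and the treatment of an empty file are assumptions made for illustration, not the build's actual code.

```python
import tempfile
from pathlib import Path


def read_proglangs(root_dir):
    """Return the proglangs for a pathway (or lesson) root directory."""
    proglang_file = Path(root_dir) / "proglang.txt"
    if not proglang_file.is_file():
        return ["pyret"]  # default when no proglang.txt is present
    # Assumption: one proglang per line; blank lines are ignored.
    langs = [ln.strip() for ln in proglang_file.read_text().splitlines() if ln.strip()]
    return langs or ["pyret"]  # assumption: an empty file also means pyret


def demo():
    """Build three throwaway root directories and read their proglangs."""
    with tempfile.TemporaryDirectory() as tmp:
        core = Path(tmp, "core")
        shipwrecks = Path(tmp, "shipwrecks")
        plain = Path(tmp, "plain")  # hypothetical pathway with no proglang.txt
        for d in (core, shipwrecks, plain):
            d.mkdir()
        (core / "proglang.txt").write_text("pyret\nwescheme\n")
        (shipwrecks / "proglang.txt").write_text("spreadsheets\n")
        return read_proglangs(core), read_proglangs(shipwrecks), read_proglangs(plain)


print(demo())
```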

1.3. Pathway narratives

The pathway root directory has a file index.adoc, which specifies the pathway narrative. The pathway root also has a resources/ subdirectory that houses additional documentation that is specific to teachers. The pathway root may also contain subdirs for front-matter, back-matter, and images: the first two add flanking matter to the final workbook for the pathway’s course, while images is a convenient location for image subfiles.

1.4. Lessons associated with a pathway

The pathway root directory has a file lesson-order.txt listing the names of the lessons associated with it, in order. These lessons occur as directories in the lessons/ directory in the repo top dir.

As with pathways, lessons can be in different natlangs and proglangs. To accommodate multiple natlangs, each lesson has a langs subdir, with subdirs (e.g., en-us, es-mx) for the different natlangs. (This is similar to how we specify multiple natlangs for pathways.)

lessons/
`—— simple-data-types/
    `—— langs/
        |—— en-us/
        `—— es-mx/

We’ll call lessons/simple-data-types/langs/en-us the lesson root directory.

As for pathways, a lesson root directory can contain a proglang.txt listing the proglang(s) associatable with it. If no proglang.txt is present, the (single) proglang is assumed to be pyret.

If more than one proglang is needed, or if the single proglang is something other than pyret, list it in proglang.txt.

1.4.1. Lesson plans

The lesson counterpart to the pathway narrative is the lesson plan, given as an index.adoc file in the lesson root directory. Here too, an images/ subdir is used for image subfiles.

1.4.2. Lesson workbook pages

A lesson typically contains a pages/ subdir, which holds the AsciiDoc sources for the pages in that lesson. The pages/ subdir has a file workbook-pages.txt, listing all these pages in order.

A lesson may also contain other subdirs, even within its pages/. These may contain files meant for inclusion in the main pages files.

1.4.3. Adding proglang-specific source

We’ve seen that both pathways and lessons may have proglang.txt identifying multiple proglangs as vehicles for that course/lesson. Proglang-specific material is specified in the source in two ways:

1: We use the directive @ifproglang{pyret}{…} to specify source fragments meant for the proglang pyret.

2: Especially in the pages/ subdirs, we use further subdirs named for the proglang to add files that would shadow the main files. Thus for a file pages/abc.adoc, the file pages/pyret/abc.adoc would shadow it for pyret, and file pages/wescheme/abc.adoc would shadow it for wescheme. If we provide shadowing files for all relevant proglangs, the main file doesn’t need to exist.

1.4.4. Solution-mode pages

The pages/ subdir in a lesson contributes to the student version of the final workbook. However, we also have a solutions workbook that is intended for teachers. For it, we have a solution-pages/ subdir alongside pages/, which contains files that will shadow the similarly named student pages.

1.5. Pathway-independent files

There are also some source files that are not affiliated with any particular pathway or lesson. These are kept in shared/langs/en-us/docroot (and the like, for other natlangs). Chief among these are textbook descriptions which are .adoc files in the textbooks/ subdir.

2. The distribution layout

The built version of a pathway, called a course, is created in a subdirectory called courses under the distribution subdir. The built version of a lesson (we’ll call it the distribution lesson) is similarly created in a subdir called lessons, also under the distribution subdir.

2.1. The distribution subdir

The courses and lessons are not direct subdirs of distribution, in order to accommodate different versions of pathways and lessons based on the natlang used. To effect this, the distribution has subdirs for each natlang (called en-us, es-mx, etc.), and it is in the courses and lessons subdirs of these that we find the natlang-specific courses and lessons. Thus, the two natlang versions of the pathway core are laid out as follows:

distribution/
|—— en-us/
|   `—— courses/
|       `—— core/
`—— es-mx/
    `—— courses/
        `—— core/

Note that we use the same name core for the two natlang versions in the distribution. We can do this because they reside inside two different subtrees, distribution/en-us and distribution/es-mx, and thus don’t risk a name clash.

As with pathways, so with lessons: To accommodate multiple natlangs, the lessons go into the lessons subdir of their natlang. E.g., for the lesson simple-data-types, we have:

distribution/
|—— en-us/
|   `—— lessons/
|       `—— simple-data-types/
`—— es-mx/
    `—— lessons/
        `—— simple-data-types/

In the following, we will assume the prevailing natural language is en-us, so we will restrict our attention to the distribution/en-us subtree. For other natlangs, the setup is exactly the same but in the appropriate natlang-based subtree.

Also, we will call distribution/en-us/courses/core the course root directory (of the English version of the core pathway). The dir distribution/en-us/lessons/simple-data-types is the distribution lesson root directory (of the English version of the simple-data-types lesson).

2.2. Proglang associated with a course

Typically — as seen in the above example — the built names of the pathway and the lesson are the same as their names in the source, but not always.

By default, the programming language associated with a course is pyret. This is marked in the course root directory as the 0-length file .cached/.proglang-pyret. (I.e., distribution/en-us/courses/core/.cached/.proglang-pyret.)

But as we have seen, a pathway may have a proglang.txt listing a different proglang or multiple proglangs.

Thus, for the pathway core, two different courses are created in distribution/en-us/courses. We continue to use the unadorned pathway name, core, for the Pyret version. For other proglangs, the course dir’s name uses the proglang as a hyphenated suffix. Thus for the WeScheme version, the course is entitled core-wescheme. I.e., the two course root directories for core are

distribution/en-us/courses/core
distribution/en-us/courses/core-wescheme

These two directories have their own proglang marker files, i.e., .cached/.proglang-pyret (as before) and .cached/.proglang-wescheme.

We often have pathways whose proglang.txt contains only a single proglang, but that proglang isn’t Pyret. E.g., the pathway shipwrecks whose proglang.txt contains only spreadsheets. In such a case, the built course is simply called shipwrecks rather than shipwrecks-spreadsheets, since there is no Pyret version to distinguish it from.
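
The naming rules of this section can be sketched as a small function. This is an illustration under stated assumptions, not the build's code: the names course_dir_name and proglang_marker are hypothetical, and the rule is inferred only from the three cases described above (Pyret unadorned, other proglangs suffixed, a lone proglang unadorned).

```python
def course_dir_name(pathway, proglang, all_proglangs):
    """Name of the built course directory for one proglang of a pathway."""
    # Pyret keeps the unadorned name; so does a pathway with a single proglang,
    # since there is no other version to distinguish it from.
    if proglang == "pyret" or len(all_proglangs) == 1:
        return pathway
    return f"{pathway}-{proglang}"


def proglang_marker(proglang):
    """Relative path of the 0-length marker file inside the course root."""
    return f".cached/.proglang-{proglang}"


print(course_dir_name("core", "pyret", ["pyret", "wescheme"]))
print(course_dir_name("core", "wescheme", ["pyret", "wescheme"]))
print(course_dir_name("shipwrecks", "spreadsheets", ["spreadsheets"]))
print(proglang_marker("wescheme"))
```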

2.3. Proglang associated with a distribution lesson

As with pathways, lessons can be in different proglangs, also specified with proglang.txt. If no proglang.txt is present, the (single) proglang is assumed to be pyret. The distribution versions of the non-Pyret lessons have the proglang as a hyphenated suffix, e.g.,

simple-data-types-wescheme

Again, as with pathways, a proglang-marker file .cached/.proglang-wescheme is placed in the distribution lesson root to identify it as a lesson using wescheme.

Note that the pathway mentions in its lesson-order.txt the lesson names in unadorned form (no proglang suffix). On building, a pathway becomes a course, and the distribution lessons associated with it are the right versions corrected for proglang. Thus the pathway core has simple-data-types as a constituent lesson. Moving over to the built distribution, the course core (Pyret) includes the lesson simple-data-types (Pyret), and the course core-wescheme includes the lesson simple-data-types-wescheme.

2.4. The course narrative and other resources

The pathway narrative index.adoc in the pathway root eventually will get converted to index.shtml in the course root. The various .adoc files in the front-matter, back-matter and resources subdirs also will get converted to HTML files, with suffix .shtml at the top level and .html at lower depths, i.e., in any pages and solution-pages subdirs. However, these HTML conversions aren’t performed at this time.

2.5. The lesson plan and workbook pages

The lesson plan index.adoc in the lesson root eventually will get converted to index.shtml in the distribution lesson root.

The .adoc files in the pages/ and solution-pages/ subdirs along with any .adoc files they include will get converted to .html files alongside.

2.6. Pathway-independent files

The pathway-independent files in shared/langs/en-us/docroot are copied over to distribution/en-us.

3. The build phases

The build is accomplished in two phases, each driven by its own Makefile. Because of the multiplicity of files to keep track of, some generated during the course of the build, the Makefile rules are themselves generated using Makefile functions. However, the input files needed in the second phase are not available until the first phase is completed, hence the need to use a separate Makefile for each of the two phases.

3.1. The first build phase

The first phase initializes the distribution directory, then copies the lessons and pathways over, adjusting subdirectories, and creating proglang-specific copies as needed. It sets up the temp files that will contain the input data to run various conversions in batch mode. Two specific PDFs, the page-not-found and the bilingual glossary, are requested to start the ball rolling.

3.2. The second build phase

The second phase does the bulk of the conversions. Requests are set up in the batch-input files for passing the various adoc files to the Racket preprocessor, which will create corresponding asc files. These are then converted to html files via a batch call to Asciidoctor.

Different batch files are then used to pick up primitives, the lesson inter-dependencies, and global image and pathway-toc listings, and to add some postprocessing.

Finally the postprocessed HTML files are converted to PDFs, and these are assembled into workbook pages.

4. Rearranging subdirs within the distribution directories

The build process starts by copying (the English versions of) the pathway directories to distribution/en-us/courses and the lesson directories to distribution/en-us/lessons, sometimes creating multiple versions, as described in the previous section, to accommodate multiple proglangs as needed. (The other natlang versions go to the corresponding subdir in distribution, e.g., es-mx, and their treatment follows a similar route, with en-us replaced.)

4.1. Modifying the distribution lesson directory

After copying a lesson into its distribution lesson directory with fixed proglang, the build process drills down its directories to see if there are any proglang-specific shadowing subdirectories. These are named for the proglang, e.g., pyret, wescheme, etc. The contents of the relevant proglang subdir are copied to the containing directory. All the other proglang subdirs are deleted. This ensures that all the files in the distribution lesson directory, at whatever depth, have content appropriate to its proglang.
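
The shadowing step can be sketched as follows. This is a hedged illustration, not the build's own script: the function name apply_proglang_shadowing and the set KNOWN_PROGLANGS are hypothetical, and the sketch follows the description literally (copy the relevant proglang subdir's contents up one level, delete only the other proglang subdirs).

```python
import shutil
import tempfile
from pathlib import Path

# Assumption: an illustrative set of directory names treated as proglang subdirs.
KNOWN_PROGLANGS = {"pyret", "wescheme", "spreadsheets"}


def apply_proglang_shadowing(root, proglang, known=KNOWN_PROGLANGS):
    """Overlay the files meant for `proglang` and drop the other proglang subdirs."""
    root = Path(root)
    # Snapshot the matching directories first (deepest first) so that the
    # copies and deletions below do not disturb the traversal.
    subdirs = sorted(
        (d for d in root.rglob("*") if d.is_dir() and d.name in known),
        key=lambda d: len(d.parts),
        reverse=True,
    )
    for d in subdirs:
        if d.name == proglang:
            # Shadowing files overwrite the generic ones in the containing dir.
            shutil.copytree(d, d.parent, dirs_exist_ok=True)
        else:
            shutil.rmtree(d)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        pages = Path(tmp, "pages")
        for sub, text in (("", "generic"), ("pyret", "for pyret"), ("wescheme", "for wescheme")):
            target = pages / sub
            target.mkdir(parents=True, exist_ok=True)
            (target / "abc.adoc").write_text(text)
        apply_proglang_shadowing(tmp, "wescheme")
        return (pages / "abc.adoc").read_text(), (pages / "pyret").exists()


print(demo())
```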

Next, if a subdirectory called pages occurs at any level (typically these are in the front-matter, back-matter and resources subdirs), the build ensures that a subdir called solution-pages is alongside it. Furthermore, it ensures that the contents of solution-pages are the same as those of the pages subdir, except for files that already exist in it (from the repo).
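
A minimal sketch of this mirroring step, assuming a plain file copy is enough. The name ensure_solution_pages is hypothetical; the one rule taken from the text is that a file already present in solution-pages is never overwritten by its student counterpart.

```python
import shutil
import tempfile
from pathlib import Path


def ensure_solution_pages(pages_dir):
    """Create solution-pages/ next to pages/, filling in only the missing files."""
    pages = Path(pages_dir)
    sols = pages.parent / "solution-pages"
    sols.mkdir(exist_ok=True)
    for src in pages.rglob("*"):
        dst = sols / src.relative_to(pages)
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
        elif not dst.exists():  # keep files that came from the repo
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
    return sols


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        pages = Path(tmp, "pages")
        sols = Path(tmp, "solution-pages")
        pages.mkdir()
        sols.mkdir()
        (pages / "a.adoc").write_text("student a")
        (pages / "b.adoc").write_text("student b")
        (sols / "b.adoc").write_text("teacher b")  # pre-existing solution page
        ensure_solution_pages(pages)
        return (sols / "a.adoc").read_text(), (sols / "b.adoc").read_text()


print(demo())
```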

Next the build looks for workbook-pages.txt inside these pages subdirs. A sanitized version of it (with comments removed) is placed in .cached/.workbook-pages.txt.kp, and an even sparer version (with the landscape/portrait information removed as well) is placed in .cached/.workbook-pages-ls.txt.kp.
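
The two levels of sanitizing might look like the sketch below. The format of workbook-pages.txt is not spelled out here, so everything about it in this example is an assumption: that comments start with ";", that each entry is a page filename, and that orientation is an optional second word on the line. The function names are hypothetical too.

```python
def sanitize_workbook_pages(text, comment_prefix=";"):
    """Drop comments and blank lines (assumed comment marker: ';')."""
    entries = []
    for line in text.splitlines():
        line = line.split(comment_prefix, 1)[0].strip()
        if line:
            entries.append(line)
    return entries


def strip_orientation(entries):
    """Keep only the page name, dropping an assumed trailing orientation word."""
    return [entry.split()[0] for entry in entries]


source = "; pages for this lesson\nabc.adoc\ndef.adoc landscape\n\n"
sanitized = sanitize_workbook_pages(source)
print(sanitized)
print(strip_orientation(sanitized))
```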

4.2. Modifying the course directory

Course dirs undergo the same modifications above for proglang and solution-pages. Note here that the subdirs front-matter, back-matter, and resources may contain pages/ subdirs with the appropriate workbook-pages.txt, and these are mined to get .cached/.workbook-pages.txt.kp and .cached/.workbook-pages-ls.txt.kp as well, just as for the lessons.

5. Batch files for HTML conversion

The next stage of the build creates batch files that carry the information the converters need. We do not simply drill down the distribution tree and convert the .adoc files immediately, as that would be too inefficient, given the converters have significant startup time. Instead, we create the batch files in distribution/en-us/.cached so the conversions can be done in one go. The names of these batch files are given by the following environment variables (which are set in the build makefile):

$ADOCABLES_INPUT
$ADOC_INPUT
$ADOC_POSTPROC_LESSONPLAN_INPUT
$ADOC_POSTPROC_NARRATIVEAUX_INPUT
$ADOC_POSTPROC_NARRATIVE_INPUT
$ADOC_POSTPROC_RESOURCES_INPUT
$ADOC_POSTPROC_PWYINDEP_INPUT
$ADOC_POSTPROC_WORKBOOKPAGE_INPUT
$PUPPETEER_INPUT
$EXERCISE_COLLECTOR_INPUT
$COURSES_LIST_FILE
$LESSONS_LIST_FILE

(The actual names used are not germane to this discussion.)

$ADOCABLES_INPUT

The $ADOCABLES_INPUT file has an entry for each adoc file that is given to the Racket preprocessor. Each entry is of the form:

("<adoc-filename>" #:containing-directory <string>
                   #:dist-root-dir <string>
                   #:lesson-plan <string>
                   #:lesson <string>
                   #:other-dir <boolean>
                   #:resources <boolean>
                   #:solutions-mode? <boolean>
                   #:proglang <string>
                   #:other-proglangs <list>
                   #:narrative <boolean>
                   #:target-pathway <string>)

Not all the keywords are necessary for all the adoc files.

The <adoc-filename> is the basename of the adoc file. The #:containing-directory value is given relative to distribution/en-us. The #:dist-root-dir value is the pathname of distribution/en-us relative to the adoc file. These three values are not optional.
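
As a worked example of the two path-valued keywords, the sketch below builds one entry for a workbook page. It is only an illustration: the helper names are hypothetical, the rendering of booleans as #t/#f and strings in double quotes is an assumption about what the preprocessor accepts, and #:dist-root-dir is computed relative to the adoc file's directory.

```python
import posixpath


def racket_literal(value):
    """Render a Python bool or string as a Racket literal (assumed syntax)."""
    if isinstance(value, bool):
        return "#t" if value else "#f"
    return '"' + str(value) + '"'


def adocables_entry(adoc_path, extra=None, dist_root="distribution/en-us"):
    """Build one $ADOCABLES_INPUT-style entry for the adoc file at adoc_path."""
    directory = posixpath.dirname(adoc_path)
    fields = {
        # Where the file lives, seen from distribution/en-us.
        "containing-directory": posixpath.relpath(directory, dist_root),
        # Where distribution/en-us lives, seen from the file's directory.
        "dist-root-dir": posixpath.relpath(dist_root, directory),
    }
    fields.update(extra or {})
    parts = ['"' + posixpath.basename(adoc_path) + '"']
    parts += [f"#:{key} {racket_literal(value)}" for key, value in fields.items()]
    return "(" + " ".join(parts) + ")"


entry = adocables_entry(
    "distribution/en-us/lessons/simple-data-types/pages/abc.adoc",
    {"lesson": "simple-data-types", "solutions-mode?": False, "proglang": "pyret"},
)
print(entry)
```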

$ADOC_INPUT

This contains a list of .asc files, which are intermediate files created by the Racket preprocessor from the .adoc files listed in $ADOCABLES_INPUT before eventual conversion by Asciidoctor into HTML.

These .asc files are always inside a .cached subdir co-level with the original .adoc. The pathnames of these .asc files are given relative to distribution/en-us.

The HTML files created by Asciidoctor also reside in the .cached subdir and go through a postprocessing phase before moving to the directory above.
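
The relationship between an .adoc file and its cached intermediates can be sketched as below. One detail is an assumption: the text only says the intermediates live in a .cached subdir next to the source, so keeping the same base name (abc.adoc giving abc.asc and abc.html) is a guess made for illustration, as is the function name.

```python
import posixpath


def cached_path(adoc_path, extension):
    """Path of the intermediate file in the .cached subdir beside adoc_path."""
    directory, base = posixpath.split(adoc_path)
    stem = posixpath.splitext(base)[0]  # assumption: the base name is preserved
    return posixpath.join(directory, ".cached", stem + extension)


page = "lessons/simple-data-types/pages/abc.adoc"  # relative to distribution/en-us
print(cached_path(page, ".asc"))
print(cached_path(page, ".html"))
```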

$ADOC_POSTPROC_*_INPUT

These are the files

$ADOC_POSTPROC_LESSONPLAN_INPUT
$ADOC_POSTPROC_NARRATIVEAUX_INPUT
$ADOC_POSTPROC_NARRATIVE_INPUT
$ADOC_POSTPROC_RESOURCES_INPUT
$ADOC_POSTPROC_PWYINDEP_INPUT
$ADOC_POSTPROC_WORKBOOKPAGE_INPUT

These list the names of the HTML files awaiting postprocessing. They are given in six lots, because the type of postprocessing varies for:

lesson plan HTML files
glossary HTML files in courses
pathway narrative HTML files
pathway resource HTML files (i.e., pathway files other than the narrative)
pathway-independent HTML files (neither lessons nor pathways)
workbook HTML files (in the lessons)

The filenames in these files are relative to distribution/en-us.

$PUPPETEER_INPUT

This is a JSON file containing the list of postprocessed HTML files that are ready to be converted to PDF.

*_LIST_FILE

These are:

$COURSES_LIST_FILE
$LESSONS_LIST_FILE

These contain the list of all the courses and lessons respectively. While they can be obtained by listing the concerned subdirs in distribution/, they are used often enough that it is useful to cache these values.

6. Populating the conversion batch files

6.1. Pathway-independent files

The names of the pathway-independent .adoc files under distribution/en-us/textbooks are collected.

The .adoc file, together with its #:containing-directory and #:dist-root-dir values, is added to $ADOCABLES_INPUT.

The .asc file (which sits in the .cached subdir), with its pathname relative to distribution/en-us, is added to $ADOC_INPUT.

The .html file (not yet postprocessed; it also sits in .cached), with its pathname relative to distribution/en-us, is added to $ADOC_POSTPROC_PWYINDEP_INPUT.

6.2. Workbook pages

We collect all the workbook-page .adoc files and add them to $ADOCABLES_INPUT. The #:containing-directory, #:dist-root-dir, #:lesson, #:other-dir, #:solutions-mode? and #:proglang values are included.

#:lesson is simply the basename of the distribution lesson.

#:other-dir is a boolean identifying whether the file is meant only for inclusion in other adoc files and not for full-scale conversion itself. Such files are located in subdirs named fragments, xtra, or xtras.

#:solutions-mode? is a boolean set to true only for adoc files in solution-pages. Such files contain information meant for teachers but not students.
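
Both booleans can be derived from the file's location alone. The sketch below shows one way to do that; the function name page_flags is hypothetical, and matching the directory names anywhere along the path (rather than only as the immediate parent) is an assumption.

```python
import posixpath

INCLUDE_ONLY_DIRS = {"fragments", "xtra", "xtras"}


def page_flags(adoc_path):
    """Derive the two location-based keyword values for a workbook-page file."""
    dirs = posixpath.dirname(adoc_path).split("/")
    return {
        "other-dir": any(d in INCLUDE_ONLY_DIRS for d in dirs),
        "solutions-mode?": "solution-pages" in dirs,
    }


print(page_flags("lessons/simple-data-types/pages/abc.adoc"))
print(page_flags("lessons/simple-data-types/solution-pages/abc.adoc"))
print(page_flags("lessons/simple-data-types/pages/fragments/part.adoc"))
```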

The .asc version of the file is added to $ADOC_INPUT.

The cached .html version of the file is added to $ADOC_POSTPROC_WORKBOOKPAGE_INPUT.

6.3. Lesson-plan files

We collect all the lesson-plan .adoc files and add them to $ADOCABLES_INPUT. The #:containing-directory, #:dist-root-dir, #:lesson-plan, #:proglang, and #:other-proglangs values are included.

#:lesson-plan is simply the basename of the distribution lesson.

#:other-proglangs is a list of the other proglangs for which this lesson is available. This is used to have the lesson plan include links to the plans for the same lesson in the other proglangs.

The .asc version of the file is added to $ADOC_INPUT.

The cached .html version of the file is added to $ADOC_POSTPROC_LESSONPLAN_INPUT.

6.4. Pathway resource files

Collect all the .adoc files except the narrative file and add them to $ADOCABLES_INPUT. Other values included: #:containing-directory, #:dist-root-dir, #:other-dir, #:resources, #:target-pathway, #:solutions-mode?, #:proglang.

#:resources is set to true.

#:target-pathway is the (base) name of the course directory (with the proglang suffix, if any).

The .asc file is added to $ADOC_INPUT.

The cached .html file is added to $ADOC_POSTPROC_RESOURCES_INPUT.

6.5. Pathway narrative files

Collect all the narrative .adoc files in the distribution and add them to $ADOCABLES_INPUT. Other values included: #:containing-directory, #:dist-root-dir, #:narrative, #:target-pathway, #:proglang, #:other-proglangs.

#:narrative is set to true.

#:other-proglangs is the list of other proglangs for which this course is available, so they can link to each other.

The .asc file is added to $ADOC_INPUT.

The cached .html file is added to $ADOC_POSTPROC_NARRATIVE_INPUT.

We also add the .asc file for the pathway glossary (this is generated by the Racket preprocessor) to $ADOC_INPUT. Its cached .html file is added to $ADOC_POSTPROC_NARRATIVEAUX_INPUT.

7. Converting .adoc to .html

7.1. Racket preprocessor

After all the $ADOC… batch files have been populated, the build runs a Racket preprocessor on the entries in $ADOCABLES_INPUT to create the .cached/*.asc files.

This is because the .adoc files created by the curriculum authors contain some directives in addition to the commands provided by raw AsciiDoc. A Racket program preprocesses these away to create a raw AsciiDoc file. This has the extension .asc and is situated in a .cached subdir alongside the .adoc file penned by the author.

The additional keyword values provided in $ADOCABLES_INPUT enable the Racket preprocessor to correctly address the different types of .adoc files we have (workbook pages, lesson plans, narratives, etc.).

A single Racket call is sufficient to process all the entries in $ADOCABLES_INPUT.

7.1.1. Primitives used

When processing lesson files (whether lesson plan, workbook page, or a file included by them), the primitives found in them are stored in temp files in .cached. These become useful when creating a collective description of all the courses and lessons.

7.1.2. Exercise files

Special directives — @printable-exercise, @opt-printable-exercise, @handout — are used to include files within the lesson directory that are meant to be exercises. Their names are stored in .cached/.lesson-exercises.txt.kp. Again these are useful for the collective description.

7.2. Calling Asciidoctor

Once the Racket preprocessor is done, we have a bunch of .cached/*.asc files in the distribution, and the list of these we have already captured in the file $ADOC_INPUT.

These files are now ready to be processed by Asciidoctor to produce the corresponding .html.

The asciidoctor command can take a bunch of input files as command-line input, so we pass it the contents of $ADOC_INPUT.
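
A hedged sketch of assembling that single call. It assumes $ADOC_INPUT holds one pathname per line and builds only the bare command; the options the build actually passes to asciidoctor are not shown in this document, so none are invented here. The function name is hypothetical.

```python
def asciidoctor_command(adoc_input_text):
    """Turn the contents of the batch file into one asciidoctor argv list."""
    # Assumption: one .asc pathname per line, relative to distribution/en-us.
    files = [line.strip() for line in adoc_input_text.splitlines() if line.strip()]
    return ["asciidoctor", *files]


batch = (
    "lessons/simple-data-types/.cached/index.asc\n"
    "lessons/simple-data-types/pages/.cached/abc.asc\n"
)
command = asciidoctor_command(batch)
print(command)
# The list could then be run once from distribution/en-us, for example with
# subprocess.run(command, cwd="distribution/en-us", check=True).
```

Paying the converter's startup cost once per batch rather than once per file is the point of collecting the names first.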

8. Postprocessing

When the Racket preprocessing and the Asciidoctor runs complete, we have a bunch of .cached/*.html files in our distribution tree. It would be nice if these were all there is to it, but there is still a bunch of post-processing left. We use the other $ADOC_POSTPROC_* files to guide this post-processing.

For starters, the .cached/*.html files need to go up one directory, so they sit alongside their source .adoc files in the distribution tree. (We may later choose to delete the .adoc files before deployment to the website, but that’s a different matter.) In particular, we need to ensure that the final HTML file has extension .shtml when it’s a lesson plan or pathway doc, either narrative or resources. It is .html in all other cases, i.e., workbook pages and pathway-independent files.
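
The move and the extension rule can be sketched together. The kind labels used here ("lesson-plan", "narrative", "resources", "workbook-page") are hypothetical names for the categories in the text, and the sketch assumes the cached file keeps its base name when it moves up.

```python
import posixpath

# Kinds that end up as .shtml, per the rule above; the labels are illustrative.
SHTML_KINDS = {"lesson-plan", "narrative", "resources"}


def final_html_path(cached_html, kind):
    """Where a .cached/*.html file lands after postprocessing."""
    cache_dir, base = posixpath.split(cached_html)
    stem = posixpath.splitext(base)[0]
    extension = ".shtml" if kind in SHTML_KINDS else ".html"
    # One directory up from .cached, i.e. alongside the source .adoc file.
    return posixpath.join(posixpath.dirname(cache_dir), stem + extension)


print(final_html_path("lessons/simple-data-types/.cached/index.html", "lesson-plan"))
print(final_html_path("lessons/simple-data-types/pages/.cached/abc.html", "workbook-page"))
```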

The following are the components of the postprocessing:

CSS path correction: When Asciidoctor is run on the .asc file, it embeds a pathname for curriculum.css. Make it relative to the HTML location. (Unfortunately, Asciidoctor can’t take a relative pathname correctly from the command-line when called on files that aren’t in the calling directory.)
Boilerplate insertion: Include links to various .css and .js files. Some are conditional: Codemirror-related links are included only if the file has tags classed pyret, racket, or circleevalsexp.
Nested span and div insertion. Asciidoctor lacks the ability to nest more than one level of div or span with specific classes. The Racket preprocessor embeds markers for these so they go through Asciidoctor verbatim. We can now scan for them and replace them with divs/spans as appropriate.
Remove intrusive spans in headers, add self-link to h2’s
Create Google-Drive versions
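
As an illustration of the conditional part of boilerplate insertion, the sketch below decides whether a page needs the Codemirror links. It is a simplified Python stand-in, not the postprocessor's code: the function name is hypothetical, and the regular expression only handles double-quoted class attributes.

```python
import re

CODE_CLASSES = {"pyret", "racket", "circleevalsexp"}
CLASS_ATTR = re.compile(r'class\s*=\s*"([^"]*)"')


def needs_codemirror(html):
    """True if any tag carries one of the code-related classes."""
    for match in CLASS_ATTR.finditer(html):
        if CODE_CLASSES & set(match.group(1).split()):
            return True
    return False


print(needs_codemirror('<code class="pyret">1 + 2</code>'))
print(needs_codemirror('<span class="big racket">(+ 1 2)</span>'))
print(needs_codemirror('<p class="note">No code here.</p>'))
```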

In general, it would be nice to reduce the amount of postprocessing needed. However there are some limitations to how much we can autoinsert during the preprocessing and Asciidoctor-ing phases. View this purely as the unavoidable finishing touch. Hopefully we can reduce this with time.

9. Collective information

The build also creates four files in distribution/en-us that combine glossary-type information from all the courses and lessons. They are: the dependency graph, the images list, the pathway ToCs, and the bilingual glossary.

9.1. The dependency graph

The lesson dependency graph dependency-graph.js is generated by the program make-dependency-graph.lua. It shows, for each lesson:

its title
its description
the pages it contains
its exercise pages
the primitives needed for it
its key words
its prerequisite lessons

The primitives are generated for each lesson as .cached/.index-primitives.txt.kp by the program collect-primitives.lua.

Note that the JS file represents the collected information as a JSON object set to a variable. The file itself isn’t JSON. This also holds for the next two JS files described below.
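
The distinction can be shown with a small sketch. The variable name graph, the keys, and the sample lesson data are all made up for illustration; the actual names used in dependency-graph.js are not given here.

```python
import json


def js_assignment(var_name, data):
    """Wrap a JSON object in a JavaScript variable assignment."""
    return "var " + var_name + " = " + json.dumps(data, indent=2) + ";\n"


def extract_json(js_source):
    """Recover the object by stripping the assignment wrapper."""
    payload = js_source.split("=", 1)[1].strip().rstrip(";")
    return json.loads(payload)


# Hypothetical content, for illustration only.
graph = {"simple-data-types": {"title": "Simple Data Types", "prerequisites": []}}
js_text = js_assignment("graph", graph)
print(js_text)
```

A JSON parser rejects js_text as a whole because of the leading assignment, which is why a consumer other than a browser has to strip the wrapper first, as extract_json does.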

9.2. Images list

A glossary of image information, images.js, is compiled from the images/lesson-images.json file in each lesson.

9.3. Pathway ToCs

A list of all the courses with their constituent lessons is generated in pathway-tocs.js.

9.4. Bilingual glossary

In addition to these JS files, a bilingual glossary bilingual-glossary.html is created in distribution/en-us/lib. This is a browsable version of the file glossary-terms.rkt in the repo.

10. PDFs and Workbooks

A course has six types of workbooks.

Each workbook is a concatenation of pages.

Most of the workbook pages are from the course’s constituent lessons, and these are listed in the lesson’s pages/workbook-pages.txt. In addition, a course’s workbook may include front and back matter, which are specified within the pathway dir itself. Some workbooks may contain exercise pages, which are referred to in the lesson plan, are present in the lesson’s pages/ dir, but not listed in its workbook-pages.txt.

The three student-facing workbooks are in the workbook/ subdir of the course:

workbook.pdf
workbook-long.pdf
opt-exercises.pdf

The workbook.pdf contains the basic lesson pages. The -long version includes the optional exercises referred to in the lesson plans. The opt- version contains just the exercises.

The three teacher-facing workbooks are named similarly but for the -sols suffix, and reside in the resources/protected subdir, which is password-protected:

workbook-sols.pdf
workbook-long-sols.pdf
opt-exercises-sols.pdf

10.1. Creating page PDFs

The script make-pdf.sh passes the list of HTML files generated in $PUPPETEER_INPUT to the Node program html2pdf.js, creating the corresponding PDFs.

10.2. Workbooks

The program make-workbook-jsons.lua scans the course and lesson dirs for the various temp files created by the previous portions of the build, and from them creates JSON files that list the constituent PDFs for each type of workbook.

The script make-books.sh uses these JSON files to create the final workbook PDFs.

11. Implementing the build process using Makefile(s)

The build was originally implemented as a long Bash script build-pathway, but following Issue 467, has been rewritten as a more maintainable collection of Makefiles that call much smaller subroutine-like scripts.

The Makefile resides at the top dir, and a build is initiated by calling make, possibly with the following options:

BOOK=yes to generate the individual and workbook PDFs
LINKCHECK=yes to verify all the internal and external links used
NATLANG=<natlang> to generate the docs in a language different from the default en-us, e.g., es-mx
SEMESTER=<season> to identify the season, e.g., fall
YEAR=<year> to identify the year, e.g., 2023

The top-level Makefile includes Makefile.all, which resides in lib/maker alongside all the other auxiliary makefiles and programs used by the make process.

Makefile.all calls two makefiles in order: Makefile.phase1 and Makefile.phase2. Both makefiles use make functions to create a set of related rules, as the number of rules cannot be determined beforehand. Furthermore, we use two makefiles to run two phases of the build, because the sources for the (generated) rules of the second phase cannot be determined until the first phase is done. We need only two sequential makefiles.

11.1. make, phase 1

Makefile.phase1 initializes the distribution/ directory, with subdir en-us/, which has subdirs lib/, extlib/, lessons/, courses/. It also zeroes out the various temp files in en-us/.cached, which contain lists of files that will eventually be fed as a batch to various processing scripts. We will mention these batch files as they come up in the process.

Using generated make rules, distribution/en-us/courses/ is populated with copies of the pathways, and distribution/en-us/lessons/ with copies of the lessons. If a lesson allows multiple proglangs (e.g., pyret, wescheme, codap, etc.), it is duplicated for each such proglang, with the proglang added as a suffix, except for pyret and none, which do not get a suffix.

The courses and lessons in the distribution are "massaged" to create solution-pages/ alongside any pages/ subdirs, and to move any shadowing files specifically meant for the prevailing proglang to overwrite files that are generic or meant for other proglangs. The massaging is done by two external scripts: massage-distribution-lesson.sh and massage-course.sh. Both these scripts make use of a program collect-workbook-pages.lua to identify within each lesson (or lesson-like entity like front-matter and back-matter in a course) all the pages that are eligible to go into the workbook — the so-called workbook pages.

After populating (or updating) the en-us/{courses,lessons}, Makefile.phase1 collects all the exercises in the lessons (these are specific exercise directives used in the adoc source of the lessons). These are placed in .cached/ subdirs in the individual lesson pages/.

In addition, phase 1 also takes care of creating the HTML version of the bilingual glossary file, and adds it to the batch file $PUPPETEER_INPUT. $PUPPETEER_INPUT will eventually contain all the HTML files that will need to be converted into PDF pages.

11.2. make, phase 2

After phase 1 is done, Makefile.phase2 picks up the next phase. It identifies the different kinds of .adoc files (already copied under {lessons, courses} in the distribution), and creates make rules for converting these to .asc files using the Racket-based preprocessor. This is accomplished by updating three batch files: $ADOCABLES_INPUT, $ADOC_INPUT and a third $ADOC_POSTPROC_*_INPUT file whose name depends on whether the adoc file in question is a lesson plan, a pathway narrative file, a pathway glossary file, a pathway resources file, a pathway-independent file, or a workbook page (which can be in both lessons and pathways).

After these batch files are updated, an external script run-asciidoctor.sh converts the files in $ADOCABLES_INPUT to their corresponding .asc (also an asciidoc file), using adocables-preproc.rkt, an adoc preprocessor written in Racket. It then uses the $ADOC_INPUT batch file containing the list of .asc’s as input to Asciidoctor. A set of .html files results alongside the .asc’s.

The preprocessing also collects the primitive functions used (if any) in each of the lesson pages, and notes down the lesson prerequisites for each lesson.

If the make var BOOK is set, the preproc rules include a further set of rules that add lines to $PUPPETEER_INPUT, so that the HTML files posited by the preproc (but only finally created after the postproc) will also become candidates for PDF conversion.

This completes the preproc subphase. After it is done, the following bunch of other rules come into play (in no temporal order):

1: The primitives are then consolidated per lesson, using the external script collect-primitives.lua.
2: The $ADOC_POSTPROC_*_INPUT batch files are now used by a postprocessing program, do-postproc.lua, that creates the final html or shtml file in the correct directory (i.e., alongside the original .adoc files). In addition, Google-drive-ready copies of these (s)html files are also created.
3: If the make variable LINKCHECK is set, a make rule uses the external script do-link-check.sh to verify that all the internal and external links used in the documents really do exist.
4: An external script make-pathways-tocs.lua collects the ToC info for all the courses into en-us/pathway-tocs.js.

After the primitive generation rule (1 above) is done, a rule calls an external script make-dependency-graph.lua that gathers the generated info about the lesson primitives and prereqs to create a global lesson dependency graph in en-us/dependency-graph.js.

An external script make-images-js.lua collects all the image information from the lessons into a global image glossary at en-us/images.js.

After the postproc rule (2 above) is done, if the make variable BOOK is set, the batch file $PUPPETEER_INPUT is used by a Node program html2pdf.js to create all the individual page PDFs.

After these individual PDFs are available, and again only if BOOK is set, a make rule uses the external program make-workbook-pages.lua to go into each course and collect all its constituent lessons' workbook and exercise pages into six course-specific lists of pages, one for each of the six different types of workbooks. The same rule then calls the Node program makeWorkbook.js to create each course’s set of workbook PDFs.
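
The ordering constraints among the phase-2 rules can be summarized in a short shell sketch. It only prints the script names in one order that respects the dependencies stated above; it does not run anything. The function name phase2_steps and its two positional arguments (standing in for the make variables BOOK and LINKCHECK) are assumptions made for illustration, and the real Makefile may schedule the independent rules differently.

```shell
# Hypothetical sketch: print the phase-2 steps that follow the preproc subphase,
# in one order that satisfies the dependencies described in the text.
# $1 stands in for the make variable BOOK, $2 for LINKCHECK (non-empty = set).
phase2_steps() {
  book=${1:-}
  linkcheck=${2:-}
  echo "collect-primitives.lua"
  echo "make-dependency-graph.lua"    # needs the consolidated primitives
  echo "do-postproc.lua"
  echo "make-pathways-tocs.lua"
  echo "make-images-js.lua"
  if [ -n "$linkcheck" ]; then
    echo "do-link-check.sh"
  fi
  if [ -n "$book" ]; then
    echo "html2pdf.js"                # needs the postprocessed HTML files
    echo "make-workbook-pages.lua"    # needs the individual page PDFs
    echo "makeWorkbook.js"
  fi
}
```

Calling phase2_steps with an empty first argument lists only the HTML-side steps; passing a non-empty first argument appends the three PDF and workbook steps at the end.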

12. Using the Makefile

The Makefile in the top directory allows you to build the documentation system. Simply type make. This creates or updates the distribution subdirectory with the various files needed for the user.

You may also set a small number of environment variables on the make command-line to further guide the build. These variable settings may be combined.

12.1. PDF and workbook generation

By default, make won’t create the PDF versions of the HTML files, nor the collections of them that form the various workbooks. PDF and workbook construction can be time-consuming and so is best relegated to the final run, when the author is sure that the HTML conversions have been thoroughly debugged. Once they are, and the PDFs are desired, request the target book on the make command-line, i.e.,

make book

12.2. Link checking

By default, the various internal and external links in the documents are not checked, because checking them can be a time-consuming process. (External-link checking is particularly heinous because it takes a long while for each URL probe to return.)

To enable link checking, use the target linkcheck:

make linkcheck

It is often advisable to do the link check just for its own sake, and only once in a while, after the distribution dir is already in place from a previous make.

12.3. Making from scratch

To remove a previously made distribution entirely, do

make clean

The next make — with whatever options you desire — will be on a fresh slate.

12.4. Deploying to website

To deploy a built distribution to the website, do

make deploy

Deployment is described in more detail in the repo’s README.

13. For users of legacy script build-pathway

Before the Makefile-based system was implemented, builds were done using a shell script called build-pathway. For those familiar with its usage and unable to let go, a new script build is provided whose API is similar to build-pathway, and which is actually a wrapper that calls the new make under the hood.

The new name build is used because build-pathway (singular) would now be a bit of a misnomer: the underlying make always creates or updates all pathways (i.e., courses). Nevertheless, for those who really can’t help using the old name, build-pathway is provided as a trivial identity wrapper for build. Just remember that you cannot supply names of specific pathways to build: such arguments are ignored with a helpful diagnostic.

build takes the following options:

--book (aka --pdf, -b): generates PDFs, both for the individual pages and for the workbooks
--deploy: deploys to website. Uses the existing distribution/ if there is one; if not, makes it first. May be preceded with settings for SKIPLIB, SEASON, YEAR
--force (aka --superforce, --super-force, -f, -F): builds from scratch, scrubbing any previous distribution/
--help (aka -h): displays help and exits
--link (aka --verify-links, -l): verifies all the links in the documentation
--natlang L: builds doc for the natural language L (default: en-us)

Options may be combined in any order.

The --help and --version options override all other options: they display info but do not build.

Use -fb or -bf for the often useful combination of --force and --book, used to make a final-cut distribution including the workbook PDFs after the source has been debugged to satisfaction.

An experimental option --adocjs is provided that tells the build to use the Node version of Asciidoctor rather than the Ruby one.
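
As an illustration of how such a wrapper could translate the options above into the make targets of section 12, here is a minimal shell sketch. It is not the actual build script: the function name build_to_make and the exact translation are assumptions, and it only prints the make command line rather than running it. It also assumes that the targets book, linkcheck and deploy can be requested in a single make invocation. The --natlang value is skipped because the text does not say how it is forwarded to make, and --help, --version and --adocjs are not modeled.

```shell
# Hypothetical sketch: map build-style options to the make commands of section 12.
# Prints the command line instead of executing it.
build_to_make() {
  force='' book='' link='' deploy='' targets=''
  while [ $# -gt 0 ]; do
    case "$1" in
      --book|--pdf)                       book=1 ;;
      --deploy)                           deploy=1 ;;
      --force|--superforce|--super-force) force=1 ;;
      --link|--verify-links)              link=1 ;;
      --natlang)
        # Forwarding of the language is not specified in the text,
        # so the value is only skipped here.
        if [ $# -gt 1 ]; then shift; fi ;;
      --*)
        echo "sketch: option $1 is not modeled" >&2 ;;
      -?*)
        # Split a combined short option such as -fb into single letters.
        flags=${1#-}
        while [ -n "$flags" ]; do
          rest=${flags#?}
          flag=${flags%"$rest"}
          case "$flag" in
            b)   book=1 ;;
            f|F) force=1 ;;
            l)   link=1 ;;
            *)   echo "sketch: short option -$flag is not modeled" >&2 ;;
          esac
          flags=$rest
        done ;;
      *)
        echo "ignoring pathway name '$1': all pathways are built" >&2 ;;
    esac
    shift
  done

  if [ -n "$book" ];   then targets="$targets book"; fi
  if [ -n "$link" ];   then targets="$targets linkcheck"; fi
  if [ -n "$deploy" ]; then targets="$targets deploy"; fi
  if [ -n "$force" ]; then
    # --force scrubs the old distribution first, then builds afresh.
    echo "make clean && make$targets"
  else
    echo "make$targets"
  fi
}
```

For example, build_to_make -fb prints make clean && make book, matching the final-cut combination described above, while a stray pathway name such as core produces only a diagnostic on standard error.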