CSS

Wednesday, March 23, 2022

Zettelkastens and Literate Programs

I've been re-reading John Harrison's Handbook of Practical Logic and Automated Reasoning (a wonderful book), with an eye towards writing notes for my Zettelkasten.

What I have done, out of habit, is I've reworked the code in Standard ML (as opposed to OCaml). I've added a "register" (mathematicians would recognize them as "theorem environments" in LaTeX) for source code, and the template for writing such notes on a 5.5 inch by 4.25 inch (A6) slip of paper looks like:

[id number]Code [name of chunk]
[English summary of the code, or the algorithm, or...]
⟨[name of chunk]⟩≡
[Standard ML code here]

Just replace the bracketed italicized text (including the brackets) with your own text. I do write a horizontal line separating the "code block" from the "text block", and underline the "Code" part of the title.

I follow the general heuristics Knuth suggests for literate programming, even using the names of code chunks in angled braces as if they were statements. Following some of the early handwritten documents on Standard ML, I underline keywords like "if", "then", "else", "case", "fun", etc., in the code block.

While reading Harrison's book, I add unit tests, write up examples, etc. This isn't done "in the abstract", I actually type the code up on a computer (e.g., here is my implementation of most of the propositional logic code).

One advantage to this is that we can prove properties about the code snippet in branches off the code zettel. Another advantage, giving a good human-level summary of the code snippet helps greatly with Harrison's book, since the techniques from chapter 3 make another appearance in chapter 6 (i.e., 300 pages later) and I forget some of the nuances surrounding the implementation ("Is skolem called before or after nnf?"), I can just glance at the snippets for guidance.

Another useful aspect of using a Zettelkasten for studying theorem provers is that it facilitates blending notes from multiple books seamlessly. I noticed this accidentally, when I was unsatisfied with Harrison's description of Herbrand universes, then started reading Chang and Lee's Symbolic Logic and Mechanical Theorem Proving (1973) description of Herbrand universes.

There are other aspects of Zettelkasten which seems uniquely suitable for studying theorem provers, but I digress.

I'm experimenting with using a Zettelkasten for literate programming "by hand" more generally, which is a bizarre but fascinating experience, especially when exploring monads in Standard ML.

Anyways, I just wanted to make note of this quirky note-taking system for literate programming.

Monday, December 6, 2021

On Structured Proofs and the Declarative Proof Style

1. Suppose we have successfully formalized the classification theorem for finite simple groups in your favorite theorem prover (Lean, HOL, Coq, Mizar, whatever). In a century's time (or several centuries from now), do you think your theorem prover will still be around and backwards compatible? If not, what good would the formalization be?

1.1. Remark (Concerns have been partially realized). We stress, despite the concerns raised being hypothetical and distant, they have been partly realized already. The Automath project's efforts to translate Landau's Foundations of Analysis would have been lost to history but for Freek Wiedijk's heroic efforts. Or, hitting closer to home for many, Lean 3 broke compatibility with Lean 2; and, as I understand it, there still remains a large quantity of Lean 2 libraries not yet ported. Worse, Lean 4 broke compatibility with Lean 3, and the Xena project's library has not yet ported to Lean 4. The idea that we would encode mathematical knowledge in one single theorem prover seems riskier than encoding it in natural language. Why not try to have the best of both worlds?

2. Readable Proofs. One of the roles for proofs in mathematics is communication. Hence we need something analogous to the notion of "readable code" in programming: readable proofs. Consider the following proof, tell me what it does (no cheating!):

prove
 ('!p q. p * p = 2 * q * q ==> q = 0',
  MATCH_MP_TAC num_WF THEN REWRITE_TAC[RIGHT_IMP_FORALL_THM] THEN
  REPEAT STRIP_TAC THEN FIRST_ASSUM(MP_TAC o AP_TERM 'EVEN') THEN
  REWRITE_TAC[EVEN_MULT; ARITH] THEN REWRITE_TAC[EVEN_EXISTS] THEN
  DISCH_THEN(X_CHOOSE_THEN 'm:num' SUBST_ALL_TAC) THEN
  FIRST_X_ASSUM(MP_TAC o SPECL ['q:num'; 'm:num']) THEN
  POP_ASSUM MP_TAC THEN CONV_TAC SOS_RULE);;

Or the following:

Proof.
  rewrite EL2 with (p_prime := p_prime) (p_odd := p_odd) by (omega || auto).
  rewrite EL2 with (p_prime := q_prime) (p_odd := q_odd) by (omega || auto).
  rewrite m1_pow_compatible. rewrite <- m1_pow_morphism.
  f_equal. rewrite <- QR6. unfold a. unfold b. auto.
  rewrite <- Zmult_0_r with (n := (p - 1) / 2).
  apply Zmult_ge_compat_l. omega. omega.
Qed.

I honestly wouldn't guess the first example had to do with proving ${\sqrt{2}\notin\mathbb{Q}}$, nor would I have imagined the second had to do with the law of quadratic recipricocity. (The first in HOL Light, the second in Coq.) What do you suppose mathematicians of the 25th century would think of these proofs? Would our efforts formalizing mathematics in a theorem prover yield fruit akin to Euclid's Elements, or would it be more indecipherable than Russell and Whitehead's Principia?

Hence, I would like to assess the following question:

3. Puzzle. What qualities make a proof "readable"?

4. Sources. We're not in the wilderness on this matter. Some very smart people have thought very hard about this question. But we haven't progressed much in the past few decades, compared to designing programming languages. We all seem to draw waters from the same well.

Andrzej Trybulec created Mizar [10] specifically for readability. Over the years, Mizar went from an abstract idea to a concrete realization, iteratively improving its input language to make it resemble a working mathematician's proofs. A cynical critic may deride Mizar as merely natural deduction in Pascal syntax, but that concedes the point: it's readable and understandable on its own, without relying on the computer to tell us the "proof state" and remaining goals in a proof. (Empirically, it turns out that Mizar reflects how mathematicians write, based on a cursory glance at 40k articles; see Josef Urban's email to the Mizar mailing list.)

(For our earlier examples, I personally think that Mizar's proof of the theorem of quadratic recipricocity, Th49 of int_5.miz, while long, is clear and readable...even if I don't fully understand Mizar. Similarly, the proof that $\sqrt{2}\notin\mathbb{Q}$ is the first theorem in irrat_1.miz. It's longer than I would find in a textbook, as are all formal proofs, but I have no difficulty making sense of it.)

The other major source of inspiration, I think, is Mark Wenzel's Isar front-end for Isabelle [13] (and enlightening discussion of a prototype [12]). Here the system emerges bottom-up, as discussed in §3.2 of Wenzel's thesis [13]. Other theorem provers attempted to simply copy/paste Isar commands, but apparently they never caught on. I suspect this is due to copying the superficialities, rather than drawing upon the underlying principles, produced an awkward input language.

There are other sources worth perusing, like Zammit's "Structured Proof Language" [17] among others. We also mention Isar inspired many prototypes to adapt Mizar-style proofs to other theorem provers (Corbineau's Czar [2], Giero and Wiedijk's MMode [3], Harrison's Mizar Mode for HOL [4]). Also worth mentioning is ForTheL [11] which emerged from the Soviet SAD theorem prover [9], possibly a future input language for Lean.

I especially want to point out ForTheL [11] as a stand-out, partly because it was designed behind the Iron Curtain, and partly because it's more of a controlled language which resembles natural language.

5. Declarative proofs. One bit of terminology, John Harrison [5] calls the structured proof style "declarative style proofs", which seems fine. Some fanatics of the procedural camp dislike the term. Giero and Wiedijk [3] point out the differences between procedural and declarative styles are:

  1. a procedural proof generally run backward (from goal to the assumptions), whereas declarative proofs run mostly forwards (from the assumptions to the goal);
  2. a procedural proof generally does not have statements containing the statements which occur in the proof states, whereas declarative proofs do;
  3. a procedural proofs have few proof cuts, whereas declarative proofs have nearly one cut for each proof step.

It turns out these qualities do make a difference on the readability of a proof script. But this is a bit like trying to learn an author's writing style by examining the statistics of the author's writing, like "Average number of words per sentence" or "Average number of syllables per word".

6. Natural deduction, vernacular. A key insight worth discussing further is that declarative/structured proofs emerge from the natural deduction calculus. Curiously, Jaśkowski [6] (who invented natural deduction independent of, and earlier than, Gentzen) does this in his original paper. Wiedijk [15] considers this avenue using a formal grammar for the proof steps, leaving notions like "term" and "formula" [i.e., the heart of a proposition/theorem] informal. So far, so good; but this has been predominantly first-order logic. How do we articulate readable proofs for type theory?

7. We construct proof steps for a declarative proof style by assigning "keywords" to rules of inference in natural deduction.

The problem with trying to find a readable proof format from natural deduction for type theory is, well, there are a lot more inference rules in the calculus of constructions (as opposed to, say, first-order logic).

The Lean4 developers have unwittingly started the road to reinventing lambda-typed lambda calculus, which has fewer rules and ostensibly could be encoded into a declarative proof style input.

8. One linguistic avenue may be found in de Bruijn's "mathematical vernacular" [1], which then evolved into WTT [7].

8.1. Remark (Soft typing). What's worth noting is de Bruijn [1] states in §1.17, "In particular we have a kind of meta-typing (the 'high typing', see Section 3.6) in the language, whereas most other systems would have such things in the metalanguage." We now call such 'high typing' either 'soft typing' [16], or 'semi-soft typing' [8].

9. Another avenue would be to weaken existing structured proof formats, so they become independent of foundations (e.g., instead of writing "${x\in\mathbb{N}}$", we would write "${x}$ be a natural number" — type theory provers would translate this into "${x{:}\mathbb{N}}$", and set theory provers would translate this into "${x\in\mathbb{N}}$" as before).

If GNU Hurd's failures have taught us anything about engineering, it's that "a bird in the hand is worth two in the bush." That is to say, it's wiser to take an existing working framework, and modify it slowly until it becomes what we want.

10. Conclusion? If this post seems less-than-satisfying, that's probably because I've only identified a problem. No solutions have been offered. That's because this is a hard problem somewhere between language design (very hard) and mathematical logic. Since we lack a robust framework for designing languages, this is far more of an art than science.

Although we have fewer than a half-dozen examples of declarative proof styles (really, maybe 3?), we have some avenues of exploration. But the design space may be much larger than we realize: we just lack the tools to describe it adequately.

One closing thought I'd pass off, though, is that type theorists encode logic in a rather protracted manner. It may be that structured proofs for type theory would be forced to "compile down" into many more steps, or maybe we just haven't adequately found a way to connect type theory to working mathematics.

Bibliography

  1. Nick de Bruijn, "The Mathematical Vernacular". In Proceedings of the workshop on programming logic (eds. Peter Dybjer, Bengt Nordstrom, Kent Petersson, et al.), 1987. https://pure.tue.nl/ws/portalfiles/portal/4318923/599592.pdf

    See also F.3 of Selected Papers on Automath for a revised version of this manuscript.
  2. Pierre Corbineau, "A Declarative Language For The Coq Proof Assistant". In Marino Miculan, Ivan Scagnetto, and Furio Honsell, editors, Types for Proofs and Programs, International Conference, TYPES 2007, Cividale des Friuli, Italy, May 2-5, 2007, Revised Selected Papers, volume 4941 of Lecture Notes in Computer Science, pages 69–84. Springer, 2007. https://kluedo.ub.uni-kl.de/frontdoor/deliver/index/docId/2100/file/B-065.pdf
  3. Mariusz Giero and Freek Wiedijk, "MMode: A Mizar Mode for the proof assistant Coq". Technical report, ICIS, Radboud Universiteit Nijmegen, 2004. https://www.cs.ru.nl/ freek/mmode/mmode.pdf
  4. John Harrison, "A Mizar Mode for HOL". In Proceedings of the 9th International Conference on Theorem Proving in Higher Order Logics, TPHOLs '96, volume 1125 of LNCS, pages 203–220. Springer, 1996. https://www.cl.cam.ac.uk/ jrh13/papers/mizar.html
  5. John Harrison, "Proof Style". https://www.cl.cam.ac.uk/ jrh13/papers/style.html
  6. Stanisław Jaśkowski "On the Rules of Suppositions in Formal Logic". Studia Logica 1 (1934) pp.5–32. https://www.logik.ch/daten/jaskowski.pdf
  7. Fairouz Kamareddine and Rob Nederpelt, "A refinement of de Bruijn's formal language of mathematics". Journal of Logic, Language, and Information 13, 3 (2004), pp.287–340
  8. Cezary Kaliszyk and Florian Rabe, "A Survey of Languages for Formalizing Mathematics". https://arxiv.org/abs/2005.12876
  9. Alexander Lyaletski, "Evidence Algorithm Approach to Automated Theorem Proving and SAD Systems". http://ceur-ws.org/Vol-2833/Paper_14.pdf
  10. Roman Matuszewski and Piotr Rudnicki, "Mizar: The First 30 Years". Mechanized Mathematics and Its Applications 4, 1 (2005) pp.3–24. http://mizar.org/people/romat/MatRud2005.pdf
  11. Andrei Paskevich, "The syntax and semantics of the ForTheL language". Manuscript dated December 2007, http://nevidal.org/download/forthel.pdf
  12. Markus Wenzel, "Isar — a Generic Interpretative Approach to Readable Formal Proof Documents". In Y. Bertot, G. Dowek, A. Hirschowitz, C. Paulin, L. Thery, editors, Theorem Proving in Higher Order Logics, TPHOLs'99, LNCS 1690, (c) Springer, 1999. https://www21.in.tum.de/ wenzelm/papers/Isar-TPHOLs99.pdf
  13. Markus Wenzel, "Isabelle/Isar — a versatile environment for human-readable formal proof documents". PhD thesis, Institut für Informatik, Technische Universität München, 2002. https://mediatum.ub.tum.de/doc/601724/601724.pdf
  14. Markus Wenzel and Free Wiedijk. "A comparison of the mathematical proof languages Mizar and Isar". Journal of Automated Reasoning 29 (2002), pp.389–411. https://www21.in.tum.de/~wenzelm/papers/romantic.pdf
  15. Freek Wiedijk, "The Mathematical Vernacular". Undated manuscript (available since at least 2004), https://www.cs.ru.nl/ freek/notes/mv.pdf
  16. Freek Wiedijk, "Mizar's Soft Type System". In: K.Schneider and J.Brandt (eds.), Theorem Proving in Higher Order Logics 2007, LNCS 4732, 383–399, 2007. https://cs.ru.nl/F.Wiedijk/mizar/miztype.pdf
  17. Vincent Zammit, (1999) "On the Readability of Machine Checkable Formal Proofs". PhD Thesis, Kent University, 1999. https://kar.kent.ac.uk/21861/

Thursday, December 2, 2021

Subgroups: Normal and...Abnormal...

1. Definition. Let ${G}$ be a group; denote its group structure as the tuple ${(G,e_{G},\star_{G},\iota_{G})}$ where ${\iota_{G}}$ is the inverse operator, ${e_{G}}$ is the identity element, ${\star_{G}}$ is the binary composition operator.

We define a Subgroup of ${G}$ to consist of a subset ${H\subseteq G}$ equipped with ${G}$'s group structure, i.e.,

  1. the binary operation restricted to ${H}$, ${\star|_{H}\colon H\times H\to G}$
  2. the inverse operator ${\iota|_{H}\colon H\to H}$
  3. the identity element from ${G}$, ${e_{G}\in H}$
such that
  1. the group composition is closed on ${H}$, i.e., ${\{h_{1}\star|_{H}h_{2}|h_{1},h_{2}\in H\}\subseteq H}$
  2. group inversion is closed in ${H}$, i.e., ${\iota|_{H}(H)\subseteq H}$
  3. the usual group properties holds for this group structure induced on ${H}$.
If further ${H\neq G}$, we call ${H}$ a Proper Subgroup of ${G}$.

1.1. Remark. We denote ${H<G}$ if ${H}$ is a proper subgroup of ${G}$, and ${H\leq G}$ if ${H}$ is a generic subgroup of ${G}$.

2. Example (Trivial subgroups). For any group ${G}$, there are always two subgroups available: ${G}$ itself, and the trivial subgroup ${\mathbf{1}}$ consisting of only the identity element. Since these subgroups come "for free", we call them Trivial Subgroups.

3. Example (Subgroups of dihedral group). Let ${n\in\mathbb{N}}$ be greater than 2. Recall the dihedral group ${D_{n}}$ consists of rotations by ${2\pi/n}$ radians and reflections. We have two subgroups (at least): one generated by rotations alone, the other generated by reflections alone.

The subgroup of rotations is a finite group consisting of ${n}$ elements. It's a cyclic group, isomorphic to ${\mathbb{Z}/n\mathbb{Z}}$.

The subgroup of reflections is less exciting. There are two elements in it: reflection about the ${x}$-axis, and the identity transformation. This is a cyclic subgroup of order 2, isomorphic to ${\mathbb{Z}/2\mathbb{Z}}$.

4. Example (NON-Example: subset alone insufficient). Consider modular arithmetic ${\mathbb{Z}/p\mathbb{Z}}$ for some prime ${p}$. This is a group consisting of a set ${\{0,1,\dots,p-1\}}$ equipped with addition modulo ${p}$, inversion maps ${m}$ to ${-m\equiv p-m\bmod{p}}$.

Although the underlying set is a subset of the additive group ${\mathbb{Z}}$, the addition operation is a different function. Consequently, the group structure on ${\mathbb{Z}/p\mathbb{Z}}$ is not equal to the group structure on ${\mathbb{Z}}$. The moral: being a subset alone is insufficient.

5. Example (Symmetric group). Let ${n\in\mathbb{N}}$. The symmetric group ${S_{n}}$ consists of permutations of the set ${\Omega_{n}=\{1,2,\dots,n\}}$. We can consider ${k\leq n}$, and then we have ${S_{k}\leq S_{n}}$ by restricting to permutations of the subset ${\Omega_{k}\subseteq\Omega_{n}}$.

6. Example. Recall the general linear group ${\mathrm{GL}(n,\mathbb{R})}$ is the group of ${n\times n}$ invertible matrices over the real numbers, using matrix multiplication as its binary operator, and matrix inversion for its group inverse. Also recall the determinant $$ \det\colon\mathrm{GL}(n,\mathbb{R})\to\mathbb{R}^{\times} \tag{1}$$ is a group morphism. Its kernel is a subgroup, called the Special Linear Group, denoted $$ \mathrm{SL}(n,\mathbb{R})=\{M\in\mathrm{GL}(n,\mathbb{R})|\det(M)=1\}. \tag{2}$$ The reader can check it is closed under matrix multiplication and matrix inversion.

7. Example. Let ${G}$ be any group. If ${g_{1},g_{2}\in G}$, then their commutator is the element ${[g_{1},g_{2}]=g_{1}g_{2}g_{1}^{-1}g_{2}^{-1}\in G}$. We can construct the Commutator Subgroup of ${G}$, denoted ${[G,G]}$ or (more commonly but more confusingly) by ${G'}$ in the literature, by considering the subgroup generated by commutators of elements of ${G}$. Then an arbitrary element of the commutator subgroup looks like $$ {} [g_{1},g_{2}][g_{3},g_{4}](\dots)[g_{2n-1},g_{2n}]\in[G,G] \tag{3}$$ and we use multiplication from ${G}$.

1. Exercises

Exercise 1. Let ${H_{1}}$, ${H_{2}\leq G}$. Is ${H_{1}\cap H_{2}}$ a subgroup of ${G}$?

Exercise 2. Let ${H_{1}}$, ${H_{2}\leq G}$. Is ${H_{1}\cup H_{2}}$ a subgroup of ${G}$?

Exercise 3. Let ${\varphi\colon G\to K}$ be a group morphism. Is the image of ${\varphi}$ a subgroup of ${K}$?

If ${H<K}$, is its preimage under ${\varphi}$ a subgroup of ${G}$?

Exercise 4. Prove or find a counter-example: if ${G}$ is a cyclic group, then every subgroup is cyclic.

2. Normal Subgroups

8. The special linear group is a rather special situation (no pun intended). We can prove that, for any ${M\in\mathrm{SL}(n,\mathbb{R})}$ and for any ${T\in\mathrm{GL}(n,\mathbb{R})}$, we have ${XMX^{-1}\in\mathrm{SL}(n,\mathbb{R})}$. Really? Look, take its determinant: $$ \begin{split} \det(XMX^{-1})&=\det(X)\det(M)\det(X)^{-1}\\ &=\det(M)=1. \end{split} \tag{4}$$ Its a property shared by the kernel of any group morphism ${\varphi\colon G\to H}$, we'd have ${k\in\ker(\varphi)}$ and ${g\in G}$ satisfy ${\varphi(gkg^{-1})=\varphi(g)\varphi(k)\varphi(g)^{-1}=\varphi(k)=e_{H}}$. This gives us a particular kind of subgroups.

9. Definition. Let ${G}$ be a group. A Normal Subgroup of ${G}$ consists of a subgroup ${H\leq G}$ such that

  1. Closed under conjugation by group elements: for any ${h\in H}$ and for any ${g\in G}$, we have ${ghg^{-1}\in H}$.

9.1. Remark. Care must be taken for the normality property: all we ask is for ${ghg^{-1}\in H}$, not that ${ghg^{-1}=h}$.

9.2. Remark (Notation). We denote a normal subgroup ${N}$ of ${G}$ by ${N \triangleleft G}$, and ${N \trianglelefteq G}$ if ${N}$ is a proper normal subgroup of ${G}$. There are a couple ways of remembering which way the triangle points: one is that we just turn the subgroup notation ${N\leq G}$ into a triangle, the other is that it "snitches" on (i.e., points to) the normal subgroup.

10. Example. The trivial subgroups are trivially normal.

11. Example. For the Dihedral group ${D_{n}}$, we have a couple subgroups (one generated by rotations, the other generated by reflections). Is one of them normal?

The rotation subgroup is generated by ${r}$ and satisfies ${r^{n}=1}$. The reflection subgroup is generated by ${s}$ and satisfies ${s^{2}=1}$. Together, they have ${s\circ r^{k}\circ s=r^{-k}}$. This implies the rotation subgroup is normal.

Is the reflection subgroup normal?

12. Theorem. Let ${G}$ be a group. If ${G}$ is Abelian, then every subgroup ${H\leq G}$ is normal.

Proof: Assume ${G}$ is Abelian. Let ${H\leq G}$ be an arbitrary subgroup of ${G}$, let ${g\in G}$ and ${h\in H}$ be arbitrary group elements. Then conjugation looks like ${g+h-g=h}$. Hence ${H}$ is a normal subgroup of ${G}$ by Definition 9. ∎

13. Example (Converse to Theorem 12 is false). Consider the quaternion group ${Q_{8}}$ generated by ${i^{4}=j^{4}=k^{4}=(ijk)^{2}=1}$ and so ${ij=k}$, ${jk=i}$, ${j=ki}$. We really have only 8 elements in ${Q_{8}}$: ${1}$, ${-1}$, ${i}$, ${-i}$, ${j}$, ${-j}$, ${k}$, ${-k}$. But we only need two of the "purely imaginary" elements (${i}$, ${j}$, ${k}$) to generate the entire group.

The proper subgroups would be generated from one complex generator, or from ${-1}$. We claim they are all normal. The case of ${\{\pm1\}<Q_{8}}$ is obviously normal, since ${x(-1)x^{-1}=-1}$ for any ${x\in Q_{8}}$.

The reasoning for proper subgroups generated by one imaginary element resembles one another, so let's consider the subgroup generated by ${i}$. Then we see $$ jij^{-1}=-jk=-i \tag{5}$$ and $$ j^{3}ij^{-3}=(-j)i(j)=i. \tag{6}$$ Similarly we'd find, from ${i^{3}=-i}$, that $$ ji^{3}j^{-1}=i \tag{7}$$ and $$ j^{3}i^{3}j^{-3}=-i. \tag{8}$$ Thus the subgroup generated by ${i}$ is closed under conjugation from all elements from ${Q_{8}}$.

Really? Well, any generic element in ${Q_{8}}$ looks like ${i^{m}j^{n}}$. Its inverse would be ${(i^{m}j^{n})^{-1}=j^{-n}i^{-m}}$. So conjugation by these elements would amount to multiplying the element ${i^{\ell}}$ from the subgroup by some ${i^{q}}$. But that's okay: elements of the form ${i^{q}}$ belong to the subgroup anyways.

14. Theorem. If ${\varphi\colon G\to H}$ is a group morphism, then ${\ker(\varphi)}$ is a normal subgroup of ${G}$.

The proof was sketched earlier in §8.

15. Theorem. Let ${G}$ be a group. Every normal subgroup ${N\triangleleft G}$ is the kernel of some group morphism.

15.1. Remark. We do not yet have the technology to prove this, yet, but it is true. Basically, we construct a morphism by taking ${g\in G}$ and mapping it to cosets ${gN=\{gn\in G|n\in N\}}$. The difficulty we have is that the image of this mapping is the collection of left cosets of ${N}$, which may or may not be a group. Further, it was rather arbitrary mapping it to left cosets of ${N}$: why not map ${g}$ to ${Ng=\{ng\in G|n\in N\}}$? Wouldn't these be distinct (i.e., not equal) mappings?

3. Exercises

Exercise 5. Find the normal subgroups of the symmetric group ${S_{n}}$.

Exercise 6. Let ${G}$ be a group, ${H\triangleleft G}$ be a normal subgroup, and ${N\triangleleft H}$ be a normal subgroup of ${H}$. Is ${N}$ a normal subgroup of ${G}$?

Exercise 7. Let ${G}$ be a group, recall the commutator subgroup from Example 7 that ${[G,G]}$ is a subgroup of ${G}$. Prove or find a counter-example: the commutator subgroup is a normal subgroup of ${G}$.

Exercise 8. Let ${\varphi\colon G\to H}$ be a group morphism, ${N\triangleleft G}$ be a normal subgroup. Is ${\varphi(N)}$ a normal subgroup of ${G}$? Consider the cases when ${\varphi}$ is injective, and when ${\varphi}$ is surjective.

Exercise 9. Let ${\varphi\colon G\to H}$ be a group morphism, and ${N_{H}\triangleleft G}$ be a normal subgroup. Is the preimage ${\varphi^{-1}(N_{H}) = \{g\in G|\varphi(g)\in N_{H}\}}$ a normal subgroup of ${G}$? Is it even a subgroup?

Saturday, November 27, 2021

ISO-2145 numbering in Plain TeX

I've been writing terse notes using something like the ISO-2145 standard. Could some TeX macros be written to support writing such notes? Of course!

Like amsart's subsections and subsubsections, I wanted my sections to be "runin style" (i.e., an unindented paragraph with a leading number, possibly with some label in bold, which is separated by vertical space from the previous paragraph). Following Knuth's WEB macros, I use \M to create a new unlabeled "section", and \N{label} to create a new labeled "section". These macros increment the section number, makes the section number (and label, if present) in bold, and does not indent the section's first paragraph.

But I wanted to allow arbitrary depths for "section numbers". What I did was treat the section number like a stack. When I wanted to work on a new subsection level, I call \M[1] (or, for labeled subsections, \N[1]{My label for the new subsection.}).

If I want to "ascend" back to the parent section, I call \M[-1] (and \N[-1]{...}) which will pop the section stack, then increment the top of the stack.

Should I want to move more than one level up or down, I simply call \M[n] (or \N[n]{...}) for any nonzero integer n. If called with n=0, then it's treated as \M (and \N{...}, resp.).

Retrospective

If I were to do this all over again, I would probably use different variable names (for example, I would probably use \chunkctr instead of \section for the counter name). But I just wanted to see if it was possible.

Also, this code could work in LaTeX, but needs some small modifications. (Well, mostly dealing with making \chunkctr a bona fide counter, so \refstepcounter could use its magic for referencing things properly.)

I guess if I wanted to be fully general, I would have some additional parameters to allow customizing the vertical spacing for separating the sections, the horizontal space between the section number and the next symbol, etc.

Code and Worked Example

\newcount\section
\section=0

% technically, sectionDepth is not needed, but it's to avoid popping off
% "too many sections"
\newcount\sectionDepth \sectionDepth=0

% \secprefix will be redefined to append the current section number
% followed by a period
\def\secprefix{}

% \pushSection simply appends "\the\section." to \secprefix's current value
\def\pushSection{
  \global\advance\sectionDepth by1
  \let\sectmp\secprefix
  \xdef\secprefix{\sectmp\the\section.}
}

% update \secprefix from "a.b.(...).c.d" to "a.b.(...).c"
% and set \section to "d" (as a number)
\def\popSection{
  \def\newPrefix{}
  \global\advance\sectionDepth by-1
  \def\updatePrefix##1.##2\endUpdate{
    \ifx\relax##2 % if we are at the last number
      % globally set \secprefix to be the leading (n-1) section numbers
      \xdef\secprefix{\newPrefix}
      % globally set \section to be the last section number from \secprefix
      \global\section=\number##1
    \else
      \xdef\newPrefix{\newPrefix##1.}
      \updatePrefix##2\endUpdate
    \fi}
  \ifx\secprefix\relax % if \secprefix is empty
  \else
    \expandafter\updatePrefix\secprefix\relax\endUpdate
  \fi
}

\def\thesection{\secprefix\the\section.}
\def\stepSec{\global\advance\section by1}

\def\downOne{
  \pushSection
  \global\section=0
}

\def\upOne{
  \popSection
}

\newcount\secIter \secIter=0

% \up ascends back to one of the parent sections, ascending #1 levels
\def\up#1{ % #1 is a negative integer
  \ifnum\sectionDepth>0%
    \secIter=#1%
    \multiply\secIter by-1
    % but \sectionDepth is a non-negative integer, so transform it
    \ifnum\secIter>\sectionDepth%
      \global\secIter=\sectionDepth
    \fi%
    \loop
      \ifnum\secIter>0
      \upOne
      \advance\secIter by-1
    \repeat%
  \fi%
}

% down appends #1 subsections to the given section number
\def\down#1{ % #1 is a positive integer
  \global\secIter=#1%
  \loop
    \ifnum\secIter>0
    \downOne
    \advance\secIter by-1
  \repeat%
}

% move up or down #1 layers, does nothing if #1 is zero
\def\adjustSection#1{\ifnum#1<0\up{#1}\else{\ifnum#1>0\down{#1}\fi}\fi}

% \M is used for unlabeled paragraphs
\def\Mlabel{\medbreak\noindent\ignorespaces%
  {\bf\thesection\ignorespaces}\enspace\ignorespaces}
\def\Mnone{\stepSec\Mlabel}
\def\Mmany[#1]{\adjustSection{#1}\Mnone}
\def\xM{\ifx\tmp[\expandafter\Mmany\else\Mnone\fi}
\def\M{\futurelet\tmp\xM}

% \N is used for labeled paragraphs
\def\Nlabel#1{\medbreak\noindent\ignorespaces%
  {\bf\thesection\ignorespaces\enspace#1\ignorespaces}\enspace\ignorespaces}
\def\Nnone#1{\stepSec\Nlabel{#1}}
\def\Nmany[#1]#2{\adjustSection{#1}\Nnone{#2}}
\def\xN{\ifx\tmp[\expandafter\Nmany\else\Nnone\fi}
\def\N{\futurelet\tmp\xN}

\M
This is a test to see if this works.

\M I hope it works.

\M[1]
And this should be \S{2.1}, I hope.

\N[1]{Claim:} This should be \S{2.1.1}.

\M[-1]
This should be \S{2.2}.

\M[3]
This should be \S{2.2.0.0.1}, I hope?

\N[-3]{Theorem.}%
This is \S3, yes?

\M[-1]
No, this is \S3.

\bye

Addendum: in LaTeX

It turns out to be a straightforward generalization to LaTeX:

\documentclass{article}
\usepackage{fullpage}
\newcounter{chunk}

\def\setSection#1{\setcounter{chunk}{#1}}
\def\stepSec{\refstepcounter{chunk}}

% technically, sectionDepth is not needed, but it's to avoid popping off
% "too many sections"
\newcount\sectionDepth \sectionDepth=0

% \secprefix will be redefined to append the current section number
% followed by a period
\def\secprefix{}
\renewcommand\thechunk{\secprefix\arabic{chunk}.}

% \pushSection simply appends "\arabic{chunk}." to \secprefix's current value
\def\pushSection{
  \global\advance\sectionDepth by1
  \let\sectmp\secprefix
  \xdef\secprefix{\sectmp\arabic{chunk}.}
}

% update \secprefix from "a.b.(...).c.d" to "a.b.(...).c"
% and set \chunk counter to "d" (as a number)
\def\popSection{
  \def\newPrefix{}
  \global\advance\sectionDepth by-1
  \def\updatePrefix##1.##2\endUpdate{
    \ifx\relax##2 % if we are at the last number
      % globally set \secprefix to be the leading (n-1) section numbers
      \xdef\secprefix{\newPrefix}
      % globally set \chunk to be the last section number from \secprefix
      \setSection{\number##1}
    \else
      \xdef\newPrefix{\newPrefix##1.}
      \updatePrefix##2\endUpdate
    \fi}
  \ifx\secprefix\relax % if \secprefix is empty
  \else
    \expandafter\updatePrefix\secprefix\relax\endUpdate
  \fi
}

\def\downOne{\pushSection\setSection{0}}

\def\upOne{\popSection}

\newcount\secIter \secIter=0

% \up ascends back to one of the parent sections, ascending #1 levels
\def\up#1{ % #1 is a negative integer
  \ifnum\sectionDepth>0%
    \secIter=#1%
    % but \sectionDepth is a non-negative integer, so transform it
    \multiply\secIter by-1
    \ifnum\secIter>\sectionDepth%
      \global\secIter=\sectionDepth
    \fi%
    \loop
      \ifnum\secIter>0
      \upOne
      \advance\secIter by-1
    \repeat%
  \fi%
}

% down appends #1 subsections to the given section number
\def\down#1{ % #1 is a positive integer
  \global\secIter=#1%
  \loop
    \ifnum\secIter>0
    \downOne
    \advance\secIter by-1
  \repeat%
}
\def\adjustSection#1{\ifnum#1<0\up{#1}\else{\ifnum#1>0\down{#1}\fi}\fi}

%% \M is used for unlabeled paragraphs
\newcommand\M[1][0]{\adjustSection{#1}\medbreak%
  \stepSec\noindent\ignorespaces%
  \textbf{\thechunk\ignorespaces}\enspace\ignorespaces}

%% \N is used for labeled paragraphs
\newcommand\N[2][0]{\adjustSection{#1}\medbreak%
  \stepSec\noindent\ignorespaces%
  \textbf{\thechunk\ignorespaces\enspace#2\ignorespaces}\enspace\ignorespaces}

\begin{document}

\M
This is a test to see if this works.

\M I hope it works.

\M[1]
And this should be \S{2.1}, I hope.

\N[1]{Claim:} This should be \S{2.1.1}.

\M[-1]
This should be \S{2.2}.

\M[3]
This should be \S{2.2.0.0.1}, I hope?

\N[-3]{Theorem.}%
This is \S3, yes?

\M[-1]
No, this is \S3.

\M[0] This should be \S4.

\N[0]{Corollary.} This is \S5?
\end{document}

Wednesday, November 24, 2021

Mathematical Language and Soft Type Systems

1. After writing some thoughts about the next generation of provers, I'd like to draw your attention to a couple papers:

  1. F. Wiedijk,
    "Mizar's Soft Type System."
    In: K. Schneider & J. Brandt (eds.), Theorem Proving in Higher Order Logics 2007, LNCS 4732, 383-399, 2007; pdf
  2. Cezary Kaliszyk, Florian Rabe,
    "A Survey of Languages for Formalizing Mathematics".
    arXiv:2005.12876, 19 pages

I mention these papers as worth reading, for a couple reasons: first, they discuss an example of what a "higher level of abstraction" actually looks like; second, Kaliszyk and Rabe's paper complements Avigad's paper quite well.

2. Definition. Recall that type theory works by making typing judgements part of the metalanguage. A Soft Type System introduces typing judgements as propositions (and types as predicates) in the object language.

2.1. Example. For a concrete example of soft types in predicate logic, a dependent type with $n$ parameters is represented as an $n+1$-ary predicate $T(x, c_{1}, \dots, c_{n})$ interpreted as $x : T~c_{1}\dots c_{n}$. Subtyping is interpreted as $\forall x, S(x)\implies T(x)$ to assert $S$ is a subtype of $T$, and so on.

2.2. Remark (Soft Types in Type Theory). This is not limited to "non-type theoretic foundations". For example C-Corn in Coq uses a version of a soft type system, as a means to increase the level of abstraction.

2.3. I bring this up because soft types increase the level of abstraction for formalizing mathematics. It's a non-obvious step, which approximates mathematical practice, and demonstrates the sort of "extensibility" I think the next generation of provers may need. Amazingly enough, it can encode the axiom of choice (see, e.g., slides about the [type] usage in Mizar for encoding choice).

3. Mathematical Language. The only addendum I have to add from my last post is just a reference to Kaliszyk and Rabe's paper. It works in a "dual" manner to Avigad's paper, trying to think about a "pseudocode" for formalizing mathematics by surveying what different provers do.

The only qualm I have with their paper is precisely addressed by Avigad's paper: it's useful to give a lot of "small snippets" when trying to think about designing a language. Working in the abstract is, well, abstract. And hard.