Description
Speaking as an Editor who has had to write, or help with, a few explainers, I find them to be an inefficient use of the WG's and the Editor's time. They effectively re-state much of what is in the specification, and in a way that is not generally useful to anyone but possibly a handful of people who could have gotten the same information from reading the use cases, charter, or introduction of the specification. They are also dropped on the floor and not updated until the next time we need to engage with the TAG or the horizontal review process (document rot).
Most recently, I tried an experiment: I used ChatGPT's "Research Mode" to ingest the TAG's "Writing Effective Explainers" and the DID v1.1 specification, and to generate the explainer for the v1.1 review. I then went through the result line by line and edited it to ensure that it reflected what was requested in "Writing Effective Explainers" as well as the current interpretation of the specification, which it did. That process took about an hour, whereas writing an explainer, at least when I do it, typically takes multiple days (because it takes a lot of effort to condense a 40+ page specification down to 4 pages or so).
The result was this document: https://github.com/w3c/did/blob/main/EXPLAINER.md, which is better than the explainer that was initially written (and then not updated for years). The TAG ended up not really reading that document, and then expected the explainer to contain things that were not requested in the "Writing Effective Explainers" document. This was the comment of concern:
> I did not read it thoroughly since it was a general explanation of DID 1.1
An explainer is required for a horizontal review, so why are we required to produce things that won't be read? Was it because it was generated by an LLM, even though it was edited by an Editor? In this particular case, the reviewer had a solid understanding of DIDs... but not reading the explainer is something I've seen happen in other reviews as well. What was more concerning was this exchange:
> I suggest we discourage LLMs for writing explainers.

> I think the way they did it is one of the least offensive ways it can be done - Manu reviewed it. The 'alternatives considered' section doesn't match what we'd expect, but this is an OK overview of DIDs in general.

> I've been reading about energy use, and it seems that just using them doesn't use the main share of the energy; it's training them.
Using an LLM for this task saved me A LOT of time, and it produced output that was probably on par with what I would have generated anyway. I was expecting the output to be largely useless, but was pleasantly surprised that the LLM gave me something that was largely workable with a single (human) editorial pass that took 30 minutes or so. It was so successful that I will probably use an LLM to write all future explainers, and I want to be up-front about that with the TAG and all the other horizontal review groups (as I was this time, by clearly marking the Explainer as having been generated by an LLM).
However, I'd go a step further -- instead of requiring explainers, the TAG should probably ensure that the introductory portions of the specification cover the general things that an explainer covers, and if they don't, the specification should be fixed. Other items should probably just be questions in the horizontal review issue that is raised.
Having Editors generate explainers is an anti-pattern -- the introductory portions of the spec should do what an explainer does, and if they don't, those portions of the specification need to be changed. This would result in easier-to-read specifications at W3C and reduce the amount of time that Editors spend writing single-use documents for the TAG that are not maintained.