vagrant share – sharing a vagrant launched headless VM service on the public interwebz

Lest I forget (which I had…):

vagrant share lets you launch a VM using vagrant and share the environment using ngrok in three ways:

  • via public URLs (expose your http ports to the web, rather than locally);
  • via ssh;
  • via vagrant connect (connect to any exposed VM port from a remote location).

So this could be handy for remote support with students… If we tell them to install the vagrant share plugin, then we can offer remote support…

Tinkering With Neo4j and Cypher

I am so bored of tech at the moment — I just wish I could pluck up the courage to go into the garden and start working on it again (it was, after all, one of the reasons for buying the house we’ve been in for several years now, and months go by without me setting foot into it; for the third year in a row the apples and pears have gone to rot, except for the ones the neighbours go scrumping for…) Instead, I sit all day, every day, in front of a screen, hacking at a keyboard… and I f*****g hate it…

Anyway… here’s some of the stuff that I’ve been playing with yesterday and today, in part prompted by a tweet doing the rounds again on:

#Software #Analytics with #Jupyter notebooks using a prefilled #Neo4j database running on #MyBinder by @softvisresearch
Created with building blocks from @feststelltaste and @psychemedia
#knowledgegraph #softwaredevelopment



Anyway… It prompted me to revisit my binder-neo4j repo that demos how to launch a neo4j database in a MyBinder container, to provide some more baby-steps ways in to actually getting started with running queries.

So yesterday I added in a third party cypher kernel to the build, HelgeCPH/cypher_kernel, that lets you write cypher queries in code cells; and today I hacked together some simple magic (innovationOUtside/cypher_magic) that lets you write cypher queries in block magic cells in a “normal” (python kernel) notebook. This magic really should be extended a bit more, eg to allow connections to arbitrary neo4j databases, and perhaps crib from the cypher_kernel to include graph conversions to a networkx graph object format as well as graphical visualisations.

The cypher-kernel uses visjs, as does an earlier cypher magic that appears to have rotted (ipython-cypher). But if we can get the graph objects into a nx format, then we could also use netwulf to make pretty diagrams…

The tweet-linked repo also looks interesting (although I don’t speak German at all, so, erm…); there may be things I can also pull out of there to add to my binder-neo4j repo, although I may need to rethink that: the binder-neo4j repo had started out as a minimal template repo for just getting started with neo4j in MyBinder/repo2docker. But it’s started creeping… Maybe I should pare it back again, install the magic from its own repo, and put the demos in a more disposable place.

Sketches Around Transkribus – Handwritten Text Transcriptions in Jupyter Notebooks

Another strike day not reading as much as I’d hoped, I auto-distracted by having a play with the Transkribus Python API. (Transkribus, if you recall, is an app that supports transcription of hand-written texts.)

The API lets you pull (and push, but I haven’t got that far yet) documents from and to the Transkribus webservice. One of the docs you can export from it (which is also available from the GUI client) is an XML doc that includes co-ordinates for segmented line regions within each page:

You can export the document from the GUI…

…but there’s a Python API way of doing it too…

So, I made a few sketches (in a notebook in this gist) that started to explore the API, including pulling the XML down, along with page images, parsing it, and using OpenCV to crop individual text lines out of the page image scan.
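As a rough idea of what the XML parsing step involves, here's a minimal sketch using just the Python standard library. It is not the code from the gist: the sample XML is a made-up fragment, and I'm assuming the export follows the PAGE XML layout (`TextLine` elements with a `Coords` element whose `points` attribute holds `x,y` pairs) under the 2013-07-15 namespace, so check the root element of your own export. The helper names are my own.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the PAGE XML schema; check the root element of your own export.
PAGE_NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

# Made-up fragment for illustration only; not taken from a real document.
SAMPLE_XML = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="page1.jpg" imageWidth="2000" imageHeight="3000">
    <TextRegion id="r1">
      <TextLine id="r1l1">
        <Coords points="100,200 900,190 900,260 100,270"/>
        <TextEquiv><Unicode>Sir, I have the honour</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="r1l2">
        <Coords points="100,300 880,295 880,360 100,365"/>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def parse_points(points):
    """Turn a 'x1,y1 x2,y2 ...' string into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def bounding_box(points):
    """Return (x_min, y_min, x_max, y_max) for a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def extract_lines(xml_text):
    """Collect the id, bounding box and any existing transcript for each text line."""
    root = ET.fromstring(xml_text)
    lines = []
    for line in root.iterfind(".//pc:TextLine", PAGE_NS):
        coords = line.find("pc:Coords", PAGE_NS)
        points = parse_points(coords.get("points", "")) if coords is not None else []
        if not points:
            continue  # skip lines with no usable outline
        unicode_el = line.find("pc:TextEquiv/pc:Unicode", PAGE_NS)
        text = unicode_el.text if unicode_el is not None and unicode_el.text else ""
        lines.append({"id": line.get("id"), "box": bounding_box(points), "text": text})
    return lines


for item in extract_lines(SAMPLE_XML):
    print(item)
```

With the page image loaded as an array by OpenCV, a box `(x0, y0, x1, y1)` can then be cropped with ordinary array slicing, `image[y0:y1, x0:x1]`.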

I then popped a function together to create a simple markdown file containing each cropped line and any transcript already added to it.
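A function along those lines might look something like the following sketch. Again, this is not the gist code: the dictionary keys (`id`, `image`, `text`), the image paths and the function name are all hypothetical, chosen just to show the shape of the thing.

```python
def lines_to_markdown(lines, title="Transcription"):
    """Build a markdown document with one image link plus transcript per text line.

    Each item in `lines` is assumed to be a dict with hypothetical keys:
    'id' (line identifier), 'image' (path to the cropped line image)
    and 'text' (any transcript so far, possibly empty).
    """
    parts = [f"# {title}", ""]
    for line in lines:
        parts.append(f"![{line['id']}]({line['image']})")
        parts.append("")
        # Leave an empty paragraph slot where no transcript exists yet
        parts.append(line.get("text") or "")
        parts.append("")
    return "\n".join(parts)


# Made-up example rows for illustration
example = [
    {"id": "r1l1", "image": "lines/r1l1.png", "text": "Sir, I have the honour"},
    {"id": "r1l2", "image": "lines/r1l2.png", "text": ""},
]
print(lines_to_markdown(example))
```

Keeping the line identifier in the image alt text is one way of holding on to enough information to map an edited transcript back to the right line later.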

My thinking here is that I could use Jupytext to open the markdown document in a notebook interface and add further transcription text to a markdown doc / notebook containing separate text lines. There’s a Python API call for pushing stuff back to the server, so I’m hoping I should be able to come up with a simple script to transform the markdown, or perhaps even notebook ipynb/JSON derived using Jupytext from it, to the required XML format and push it back to the Transkribus server.

(You can see where I’m going here, perhaps? A simple notebook UI as an alternative to the more complex Transkribus UI.)

The next step, though, is to see if I can get the Transkribus service to find the text lines on a new page in a document already uploaded to the service and then pull the corresponding XML down; then see if I can upload a document to the service. (I also need to have a go at creating a document collection.) Then I’ll be able to think a bit more about generating the XML I need to push a new, or updated, transcript back to the Transkribus service.

I should probably also try getting a config to run this in MyBinder, and working on a reproducible demo (the sketch uses a document I’ve uploaded and partially transcribed, and I’m not sure how to go about sharing it, if indeed I can?)

Sketches Around The National Archives

I had intended to spend strike week giving my hands a rest, reading rather than keyboarding, but as it was I spent today code-sketching around the National Archives, as well as other things.

In trying to track down original Home Office papers relating to the Yorkshire Luddites, I’ve been poking around the National Archives (as described in passing here). Over the last couple of years, I’ve grown weary of search interfaces, even Advanced Search ones, preferring to try to grab the data into my own database(s) where I can more easily query and enrich it, as well as join it with other data sources.

I had assumed the National Archives search index was a bit richer than it is (I put down my lack of success in many searches I tried to unfamiliarity with it) but it seems pretty thin – an index catalogue that indexes the existence of document collections but not what’s in them to any great level of detail.

But assuming there was rather more detail than I seem to have found, I did a few code sketches around it that demonstrate:

  • using mechanicalsoup to load a search page, set form selections, “click” a download button and capture the result;
  • using StringIO to load CSV data into a pandas dataframe;
  • using spacy to annotate a data frame with named entities;
  • exploding lists in a data-frame column to make a long dataframe therefrom;
  • expanding a column of tuples in a dataframe across several columns;
  • using Wand (a Python API for imagemagick) to render pages from a PDF as images in a Jupyter notebook (Chrome is borked again, not rendering PDFs via a notebook IFrame).
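To give a flavour of the pandas bits in that list, here's a minimal sketch (not the gist code) that loads CSV text via StringIO, attaches a list of entity tuples to each row, explodes the lists into a long dataframe, then spreads the tuples across columns. The CSV rows are invented toy data, and `toy_entities()` is a hypothetical stand-in for a real named entity tagger such as spacy, so that the sketch runs without a language model installed.

```python
from io import StringIO

import pandas as pd

# Invented toy rows for illustration; not real catalogue records.
CSV_TEXT = """reference,description
EX 1/1,Letter from Joseph Radcliffe at Huddersfield
EX 1/2,Report concerning William Horsfall
"""

# Load CSV text (e.g. captured from a download) straight into a dataframe
df = pd.read_csv(StringIO(CSV_TEXT))


def toy_entities(text):
    """Hypothetical stand-in for a tagger: return (entity, label) tuples found in text."""
    known = {
        "Joseph Radcliffe": "PERSON",
        "Huddersfield": "GPE",
        "William Horsfall": "PERSON",
    }
    return [(name, label) for name, label in known.items() if name in text]


# Annotate each row with a list of (entity, label) tuples
df["entities"] = df["description"].apply(toy_entities)

# Explode the list column: one row per entity, other columns repeated
long_df = df.explode("entities").dropna(subset=["entities"]).reset_index(drop=True)

# Expand the column of tuples across two new columns
expanded = pd.DataFrame(
    long_df["entities"].tolist(), columns=["entity", "label"], index=long_df.index
)
long_df = pd.concat([long_df.drop(columns="entities"), expanded], axis=1)

print(long_df)
```

The `dropna()` step is there because exploding an empty list leaves a missing value, which would otherwise trip up the tuple expansion.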

Check the gist to see the code… (Bits of it should run in MyBinder too – just remember to select “Gist”! spacy isn’t installed at the moment: Gists seem to be a bit broken at the moment, the requirements.txt file is being mistreated, and I don’t want to risk breaking other bits as a side effect of trying to fix it. If Gists are other than temporarily borked, I will try to remember to add the code within this post explicitly.)

In Search of Rebellion – Tracking Down the Luddites Whilst On Strike…

At the monthly Island Storytellers session last week, the theme being “rebellion”, I clumsily told a two-part tale of the Yorkshire Luddites, the first part on the machine breaking activities of March and April, 1812, the second covering the murder of William Horsfall in April of that year, and the consequent Special Commission in York in January 1813.

Though there was way too much for the telling (and notwithstanding I’m still trying to find a storytelling voice), it’s helped me bed in some of the names and dates, certainly enough to start pulling together a set of stories around that should give me a tale of rebellion or murder should ever I need one again…

I’ve also started collecting books on the subject and, given this week is a strike week, an opportunity to start trying to find my way into the archives.

There are also lots of things to crib from…

Dave Pattern’s most excellent Huddersfield Exposed website contains a wealth of scanned resources on all matters relating to the history of Huddersfield, several of them relating to the Luddites, including a scan of the second edition of Peel’s Risings of the Luddites, Chartists and Plugdrawers, Cowgill’s Historical Account of the Luddites of 1811, 1812 and 1813, and, in direct response to a request for the same (thanks, Dave :-), a scan of the Proceedings at York (Special Commission 1813) on which the two aforementioned works draw heavily.

Rather more specifically, the Luddite Centenary blog is just an amazingly comprehensive retelling, recording on a daily basis, diary/calendar style, the unfolding history of 200 years before. Whilst some of the posts include literal transcripts of historical documents, many re-present the events in a more narrative way, albeit with pointers into the historical record. I’m trying to get hold of an export version of this site because it’d be a wonderful thing to try to pull into a database and run a named entity tagger over, for example, but I may just scrape it on the side too… erm… fair dealing in terms of personal research?!

There are also several notable books out there to add colour and background, as well as relevant context and critique around the social, political and economic conditions of the time, and I’ll be making my way through those over the strike period (and probably beyond): Darvall’s Popular Disturbances and Public Order in Regency England (based on his PhD thesis), and Hammond & Hammond’s The Skilled Labourer, 1760-1832, for example. (Another, more recent, PhD thesis that looks relevant for dipping into the archives is Bend’s 2018 thesis, The Home Office and public disturbance, c. 1800-1832.)

As far as original documents go, the Home Office archives HO 40/1 The Luddite riots – reports and HO 40/2 The Luddite riots – military reports are where I’m starting, and which are perhaps most immediately relevant. (Additional Home Office records can be found via the National Archives here, or search the National Archives using references of the form HO-42-19.) I can’t quite thoile getting Writings of the Luddites just now, but I’m intrigued as to what’s in it… (Please feel free to buy and ship me a copy from my wishlist… Or anything else from on there, for that matter, it being Christmas upcoming and all that…;-)

There are probably lots of other Home Office collections that contain relevant stuff, but I’ll be relying on secondary sources to give me rather more targeted initial hooks into those…

One of the things I am finding quite tricky is actually reading the handwritten script (palaeographer, I am not..). Someone has obviously read the originals in compiling posts on the Luddite Centenary blog, but I haven’t been able to find the original transcripts anywhere.

One thing I have been using to support my own transcription (using the Luddite Centenary posts as a crib, I have to admit!) is Transkribus, which I found via a British Library site, I think, an EU funded project that provides a cross-platform app for supporting the transcription of hand-written docs. (I had considered trying to build my own tool chain, but this was much easier!)

The app itself provides, out of the can, the ability to identify lines of handwriting and then you can provide your own transcription against the line:

The application can also try to do script2text conversions. There are some built in models available, but they didn’t seem to work so well. The idea seems to be more that you provide your own transcribed documents and when you have 15k words / 50 pages or so ready to go, you request permission to train a model on that; but that will take me some time to get to!

(I am hoping to bootstrap at some point, getting a model that can start to help with making transcriptions at least, providing a crude draft I could then work from to correct…)

Transkribus also allows you to tag certain elements, but I don’t think the tags, which would presumably be used as the basis for training a named entity tagger, are used for anything much at the moment. Still, it makes sense to tag-as-you-go, I guess!

At the moment I’m still in very early days, and my reading is not that fast. I have started wondering about models based on particular correspondents, such as Joseph Radcliffe, Justice of the Peace in Huddersfield. I’m not sure if the CC00727 – SIR JOSEPH RADCLIFFE OF HUDDERSFIELD, LUDDITE RECORDS ON MICROFILM (MIC:5) held by the West Yorkshire Archive Service contain papers written by Radcliffe as well as ones sent to him, but if they do it might be interesting to try to get a digitised copy of them and run them through against the model…

One thing I have found slightly trickier than I’d expected is tracking down both Parliamentary papers and Parliamentary Acts. The UK Parliament Parliamentary Archives sends you off to a commercial Proquest database (subscription required; I presumably do have academic access, but: a) I’m on strike, so using my credentials would be crossing the picket line; b) other people aren’t so privileged).

What is irksome is that I can download a scanned copy of the pages from Google going from Google Books, to trying to read the book (not necessarily successfully — I don’t have cookies set…) on Google Play, which adds it to my Google Play library:

and from where I can download it as a PDF…

A quick way into the Google Book pages for the Parliamentary Papers can be found here, Britain, Parliamentary Papers on the Post Office, Sessions 1810 – 1819. (Similar links aren’t on the Parliamentary Archive pages, perhaps because the Google scans… well… Google… Their rapacious and flagrant disregard for copyright is handy, sometimes…)

Accessing Parliamentary debates is possible via a hacky API. For example, the Frame Work Bill, which you can find being introduced in Journal of the House of Commons, Volume 67 (1812), p.116. (again, via Google) can be tracked, if you browse enough pages, through the following debates:

and so on…

Finding the Act, once passed, becomes another matter. In the case of the 1812 Frame Breaking Act, which is to say, 1812: 52 George 3. c.16: The Frame-Breaking Act, or more fully “An Act for the more exemplary Punishment of Persons destroying or injuring any Stocking or Lace Frames, or other Machines or Engines used in the Framework knitted Manufactory, or any Articles or Goods in such Frames or Machines”, a transcription is available via The Statutes Project, which itself got the transcription from the Luddite Bicentenary website, but that is far from comprehensive in terms of complete transcriptions.

However, The Statutes Project does also provide a chronological list of UK Statutes which links, again, to Google Book scanned versions of the statute books (example). And again, PDFs can be downloaded.

A couple of other notable Acts are the Unlawful Oaths Act (May 1812, 52 Geo. III c. 104) and the Watch and Ward Act (March 1812, 52 Geo. III c. 17), aka the Nottingham Peace Act, aka the Preservation of the Peace Act. By the by, I note a locally published copy of this act on the Calderdale “From Weaver to Web” Visual Archive website.

For a list of the actual acts by name, Wikipedia seems most convenient: List of Acts of the Parliament of the United Kingdom, 1801–1819.

As a break from the reading, I’ve also started to track down related things to listen to and watch… For example, The Luddite Lament, a BBC radio programme from 2011, now on BBC Sounds, provides an interesting take on the Luddite times from the songs that commemorate it.

Finding songs otherwise is pretty tricky (I’m still trying to figure out how to do anything useful on the Vaughan Williams Memorial Library website!). There are some transcribed here and there’s at least one on the Luddite Bicentenary site: The Hand-Loom Weavers’ Lament. There are also a couple on the Yorkshire Garland Group Yorkshire folk song website, specifically: Foster’s Mill and The Cropper Lads.

Telly wise, there’s a Thames TV drama documentary from 1988 on The Luddites (available here but the Sophos spyware IT installed on my machine tries to block this site; it’s also on Youtube, so once again, Google’s disregard for all things copyright, except when it suits them, is handy…). It reminded me of Culloden, taking a documentary style approach as if it were recorded at the time. There’s also a Granada TV series from 1967, Inheritance (catchphrase: “there’s trouble at t’mill”), based on a novel of the same name by Phyllis Bentley. I’m waiting for a secondhand copy of the book to arrive, but haven’t tracked down the video…

PS Just as an aside, the Luddite history also acts as a useful branching point into other stories. For example, during the attack on Cartwright’s Rawfolds Mill, two Luddites died (“justifiable homicide”, no trial necessary) and two others were suspected to have died shortly thereafter. In the days following the attack, a local parson lodging at Lousy Farm (now Thorn Bush Farm) in Liversedge, near to his church, St Peter’s, Hartshead-cum-Clifton [map], from whence this legend comes, was passing the church in the early hours of the morning. He heard a disturbance, and noticed several men secretly burying someone in the south-east corner of the graveyard. Knowing of the recent action, and further that there had been no recent burials in that part of the graveyard (the men were not graverobbers), he did not intervene but carried on his way. The Parson, who had been appointed to a curacy at All Saints, Dewsbury, in December, 1809, and thence to St Peter’s in March, 1811, had originally hailed from Ireland under the name Patrick Brunty. Upon taking a place at St John’s College, Cambridge, in October, 1802, he had changed his surname, aged 25, to Brontë. He was later to marry and have several children, including a daughter whose second novel, “Shirley”, published in 1849, was set in, and around, the Spen Valley. The novel fictionalised the Luddite times, though several historical figures are recognisable within it. That daughter’s first novel, “Jane Eyre”, had previously garnered good reviews; her name, as you may already have guessed, was Charlotte. Her father’s tales of life in and around St Peter’s had surely (doh!) informed that tale…

On Strikes and Publishing…

Being a member of the union, I’m on strike for as long as it lasts. One of the grounds for the strike is manageable workloads, so I was rather surprised to be asked yesterday evening (erm… evening…;-) to comment on the final version / revisions in light of reviewers’ comments, of a paper I’m named on that needs to be returned before the strike is over.

My formal academic publishing record is so poor I guess I shouldn’t begrudge any opportunity to get entered into the REF, but there’s a but…

One of the issues I have with academic publishing is the relationship between academia and the publishing industry. The labour and intellectual property rights are gifted by academics and academic institutions to the publishers, then the academic institutions pay the publishers to access the content.

As an employee of a university, my contract has something to say about intellectual property rights; I’m also pretty sure I’m not allowed to enter the institution into legally binding contracts. However, it’s par for the course for academics to sign over intellectual property rights in the form of copyright to academic publishers. (I’ve never really been convinced they/we are legally entitled to do so?)

But that’s not the issue here. Strikes are intended to cause disruption to the activities of the organisation the strikers are employed by. We’re on strike. Partly over workloads. Universities benefit from their academics publishing in academic journals in a variety of ways (and yes, I do know I’ve not played my part in this for years, ever since a researcher on a temporary contract I was publishing with was let go; IIRC, I offered 10% of my salary, 20% if need be, to help keep them on till we managed to find some funding, even though internal money was around at the time; it would have been in my interest, academically speaking and career progression wise…).

So… the strike is an opportunity to raise concerns through causing disruption.

One of the current strike concerns is workload. Universities either value academic publishing or they don’t. If they do, providing time in work time to publish is part of that contract. On the other hand, an academic makes themselves more employable by having a better publishing record, so using strike time on “personal brand boosting” academic publishing gives the academic power when it comes to personal negotiations with the academy, for example over salary grading, or when threatening to leave. (Many universities, I think, can suddenly find a Chair to offer to someone who has been offered a Chair elsewhere in an attempt to retain them…)

But if workload is a legitimate issue, then engaging in an activity that an institution may sideline on the grounds that they know the academic will use their own personal time, including strike time, to pursue, seems counter to the strike’s concerns?

Academic publishers and conferences may actually benefit from the strike too, in terms of time being freed up by strike action for such activity (Lorna Campbell posted eloquently on a related dilemma yesterday in terms of what to do regarding attendance of events taking place during, but booked prior to, strike action being called: Where to draw the line?).

Whilst the strike is directed at the employers rather than the publishers, when it comes to workload, surely the way the employer-publisher complex is organised is part of the problem? So should the strike not also be directed at the publishers? If journal issues or conference plans are disrupted, isn’t that part of the point? (And yes, I do know: many academic conferences are organised by academics; I used to organise workshop sessions myself; but some also have a commercial element…)

Another of the issues the union keeps returning to is the question of pensions. Academic authors, signing away as they do intellectual property rights that may be theirs, or may be their employers, also sign away pension pin money in the form of royalties they don’t otherwise receive.

Whilst teaching myself R a few years ago, I kept notes and published them as a self-published book on Leanpub. The royalties from it only ever trickled in, but they cover my Dropbox and WRC+ subscription costs and buy me the odd ticket to go and see the touring cars or historics. At the time, I started sketching out how many self-published books I’d need to eke out a living on; I had enough blog posts on Gephi, OpenRefine and various data journalism recipes to be able to pull a couple of manuals together in quite quick time, but figured I’d probably need to crank out a quick manual every couple of months to make a go of it and rely on organic sales without engaging in any marketing activity.

One of the struggles I have with strikes is knowing how to spend my time whilst on strike given that I am supposed to remain available for work, and then deliberately withdraw my labour, rather than take the time as a de facto holiday. Idly wondering about what the point of the strike is, and what it’s supposed to achieve, is part of the strike action I take (as I realise from previous posts on strike days, such as On (“)Strike(“) <- once again, WordPress misbehaves…).

And one thing this post has got me wondering about is: should academics go on strike against the publishers?

PS thinks: one of the purposes of strike disruption is to get folk who may be being disrupted but who sympathise with your cause to help lobby on your behalf. If academic strikes against employers also mean not supplying publishers, the publishers may then also start to lobby the employers on behalf of the striking academics because they don't want their businesses disrupted… Hmm.. Strange bedfellows… My enemy's enemy is my friend…

PPS Double thinks: not publishing affects the REF, so by not using strike time to get ahead with a research paper, you put more pressure on the organisation who feels its REF returns may get hit? Rather than using the strike time to potentially improve the institution's REF return? (And yes, I know: as well as your own… But strikes do involve self-sacrifice; that's also part of the point: that you are willing to do something that may cause you short-term harm on the way to improving conditions for everyone in the longer term.)

On (Not) Working With Open Source Software Packages

An aside observation on working with open source software packages (which I benefit from on a daily basis. The following is not intended as a particular criticism, it’s me reflecting on things I think I’ve spotted and which may help me contribute back more effectively.)

There are probably lots of ways of slicing and dicing how folk engage with open source projects, but I’m going to cut it this way:

  • maintainer;
  • contributor;
  • interested user.

The maintainer owns the repo and has the ultimate say; a contributor is someone who provides pull requests (PRs) and as such, tries to contribute code in; an interested user is someone who uses the package and knows the repo exists…

The maintainer is ultimately responsible for whether PRs are accepted.

I generally class myself as an interested user; if I find a problem, I try to raise a sensible issue; I also probably abuse issues by chipping in feature requests or asking support questions that may be better asked on Stack Overflow or within a project’s chat community or forums if it has them. (The problem with the latter is that sometimes they can be hard to find, sometimes they require sign on / auth; if I submit an issue to them, it’s also yet another place I need to keep track of to look for replies.)

On occasion, I do come up with code fragments that I share back into issues; on rare occasions, I make PRs.

The reasons I don’t step up more to “contributor” level are severalfold:

  • my code sucks;
  • I have a style problem…
    • I don’t use linters, though this is something I need to address;
    • I don’t really know how to run a linter properly over a codebase;
  • I don’t know how to:
    a) write tests;
    b) write tests properly;
    c) run tests over a codebase.
  • I don’t read documentation as thoroughly as perhaps I should…

Essentially, my software engineering skills suck. And yes, I know this is something I could / should work on, but I am really habituated to my own bad practice, stream-of-consciousness coding style…

One of the things I have noticed about stepping up is that it can be hard to step-up all the way, particularly in projects where the software engineering standards of the maintainer are enforced by the maintainer, and the contributors’ contributions (for whatever reason: lack of time; lack of knowledge; lack of skills) don’t meet those standards.

What this means is that a PR may work for the contributor but not meet the standards of the maintainer, and so the PR just sits, unaccepted, for months or years.

For the interested user, if they want the functionality of the PR, they may then be forced into using the fork created by the contributor.

However, a downside of this is that the PR may have been created by the contributor to fix an immediate need: it does the job they need at the time, they use it, and move on, but as a goodwill gesture they chip the PR in.

In such a case, the contributor may not have a long-term commitment to the package (they may just have needed it for a one off), so building in tests that integrate well with the current test suite may be an additional overhead. (You could argue that they should have written tests anyway, but if it was a one off they may have been coding fast and using a “does it work” metric as an implicit test on just the situation they needed the code to work in. Which raises another issue: a contributor may need code to work in a special case, but the maintainer needs it to work in the general case.)

For the contributor who just wanted to get something working, ensuring that the code style meets the maintainer’s standards is another overhead.

The commitment of the contributor to the project (and by that, I also mean their commitment in the sense of using the package regularly rather than as a one off, or perhaps more subtly, their commitment to using the package regularly and their PR regularly) perhaps has an impact on whether they value the PR actually making it into master. If they are likely to use the feature regularly, it’s in their interest to see it get into the main codebase. If they use it as a one off, or only irregularly, their original PR may suffice. A downside of this is that over time, the code in the PR may well start to lag behind that of code in master. Which can cause a problem for a user who wants to use the latest master features and the niche feature (implemented off a now deprecated master) in the PR.

For the contributor, they may also not want to have to continue to maintain their contribution, and the maintainer may well have the same feeling: they’re happy to include the code but don’t necessarily want to have to maintain it, or even build on it (one good reason for writing packages that support plugin mechanisms, maybe? Extensions are maintained outside the core project and plugged in as required.)

By the by, a couple of examples that illustrate this if I return to this idea and try to pick it apart a bit further and test it against actual projects (I’m not intending to be critical about either the packages or the project participants; I use both these packages and value them highly; they just flag up issues I notice as a user):

  • integrating OpenSheetMusic (a javascript music score viewer that is ideal for rendering sheet music in Jupyter notebooks) into music21; an issue resulted in code that made it as far as a PR that was rejected, iterated on, but still fails a couple of minor checks…
  • hiding the display of a code cell in documentation generated by nbsphinx. There are several related issues (for example, this one, which refers to a couple of others) and two PRs, one of which has been sitting there for three years…

Now it may be that in the above cases, the issues are both niche and relate to enabling or opening up ways of using the original packages that go beyond the original project’s mission, and the PRs are perhaps ways of the contributor co-opting the package to do something it wasn’t originally intended to do.

For example, the OpenSheetMusic display PR is really powerful for users wanting to use music21 in a Jupyter notebook, but this may be an environment that the current package community doesn’t use. Whilst the PR may make the package more likely to be used by notebook users and grow the community, it’s not core to the current community. (TBH, I haven’t really looked at how the music21 package has been used: a) at all, b) in the notebook community, for the last year or so. The lack of OpenSheetMusic support has been one reason why I drifted away from looking at music packages…)

In the case of nbsphinx, which was perhaps developed as a documentation production tool, and as such benefits from code always being displayed, the ability to hide input cells makes it really useful as a tool for publishing pages where the code is used to generate assets that are displayed in the page, but the means of production of those assets does not need to be shown. For example, a page that embeds a map generated from code: the intention is to publish the map, not show the code that demonstrates how to produce the map. (Note: hiding input can work in three ways: a) the input is completely removed from the published doc; b) the input is in the doc, but commented out, so it is not displayed in the rendered form; c) the code is hidden in the rendered form but can also be revealed.)

In both the above cases, I wonder whether the PR going outside the current community’s needs provides one of the reasons why the PRs don’t get integrated? For example, the PR might open the package to a community that doesn’t currently use the package, by enabling a necessary feature required by that new community. The original community may see the new use as “out-of-scope”, but under this lens we might ask: is there a question of territoriality in play? (“This package is not for that…”)