Read-it Later / Page Archiving

Summary
Being able to make a snapshot of a page to:

  • Read it later when offline
  • Access the original page when the site goes down
  • Keep annotations anchored even if the site changes

Questions to the community:

  1. What is the reason you’d like to have this feature? What workflows would you like it to enable?
  2. How would you like this feature to be implemented?
  3. What are good examples of other services that do this feature well?
  4. Which formats for exporting the archives are important to you?
  1. It’s a much cleaner note experience to be able to view what I viewed previously from within the app, without having to go to a whole new jarring page, orient myself, and hope nothing changed. The purpose of using something like this is that links and sites change, and I want to be able to refer to the thing I originally looked at.
  2. I’d like to have separate “search” and “view” sections, where I search for things on the left and the result gets loaded into the “view” window on the right when I click on it, similar to Evernote.
4 Likes
  1. As stated above, I would like this feature because it’d be nice to get the information I’m looking for without having to leave the “app.” Similarly, on a mobile device I’d like to get this info without having to open a browser, which may incur much more network load. Finally, I travel quite a bit and sometimes have no connectivity, so being able to view a local cache would be wonderful.
  2. I just need text in 90% of cases. A single HTML page with consistent styling would be just fine. I’m thinking “reader mode” in Pocket for everything: keep most HTML tags, but worry much less about actual styling.
  3. I think Pocket does this well. I’ve considered paying for Pocket for this reason, but would love to auto-archive things with specific rules (such as an amount of time spent on the page).
  4. Exporting in specific formats is not important to me, but keeping it open is. I want to be able to, in a pinch, just open the archive in a vim session to extract the info I need. It’s fine if it’s compressed first, or something like that.
3 Likes
  1. Link rot has already taken a major bite out of my bookmarks, and content does change, so archiving a page can save me time down the road. In addition, a “proof of existence” feature can help with a variety of things, especially in a legal context.

  2. An option to automatically archive any page when an annotation is made on it.

  3. SingleFile seems to work well and incorporates proof of existence. I have just started to test it. https://github.com/gildas-lormeau/SingleFile

  4. No preference so long as it’s thorough. SingleFile’s format seems adequate.
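
For anyone wondering what “proof of existence” boils down to, here is a minimal sketch in Python. It is an assumption on my part, not how SingleFile actually does it: the function names and the record format are made up for illustration. It only hashes the saved snapshot and stores the digest with a capture time. A local hash shows the file has not changed since capture; for a legal context the digest would additionally need to be timestamped by an independent third party.

```python
import hashlib
from datetime import datetime, timezone


def existence_record(snapshot_html: str, url: str) -> dict:
    """Build a digest record for one archived page (hypothetical format)."""
    digest = hashlib.sha256(snapshot_html.encode("utf-8")).hexdigest()
    return {
        "url": url,
        "sha256": digest,
        # Local clock only; not trustworthy as evidence on its own.
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }


def matches(snapshot_html: str, record: dict) -> bool:
    """Check that a snapshot is identical to the one that was recorded."""
    digest = hashlib.sha256(snapshot_html.encode("utf-8")).hexdigest()
    return digest == record["sha256"]
```

Any later edit to the archived file, even a single character, makes `matches` return False.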

4 Likes

Thank you all for your input here; it has already been very helpful.

Now I am happy to let you know that we are starting the design research to make this feature happen :tada:
We’ve prepared a survey to learn more about your use cases and to get inspired by what is out there already.

It would be great if you took a few minutes to help us out!

1 Like

I tried SingleFile (good service!) and ran into a problem: Memex 2, being a browser extension, cannot open an HTML file from disk.

So I made a script to publish the HTML files saved by SingleFile to a simple static site.
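
The idea can be sketched roughly like this in Python. This is a simplified illustration rather than the exact script: `build_index` and the flat folder layout are assumptions. It writes an `index.html` that links to every saved page in a folder, so the folder can be served over http:// instead of file://.

```python
import html
from pathlib import Path
from urllib.parse import quote


def build_index(archive_dir: Path) -> Path:
    """Write index.html linking to every saved .html page in archive_dir."""
    pages = sorted(
        p for p in archive_dir.glob("*.html") if p.name != "index.html"
    )
    items = "\n".join(
        # quote() makes the file name URL-safe, html.escape() makes it markup-safe.
        f'<li><a href="{html.escape(quote(p.name))}">{html.escape(p.stem)}</a></li>'
        for p in pages
    )
    index = archive_dir / "index.html"
    index.write_text(
        f"<!doctype html>\n<title>Archive</title>\n<ul>\n{items}\n</ul>\n",
        encoding="utf-8",
    )
    return index
```

The folder can then be served locally with something like `python -m http.server --directory <folder>`, or uploaded to any static host.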

It works the same way as Pinboard.in’s web archiving, but Memex currently has a bug: it can neither highlight nor summon the tooltip on a page shown in an <iframe>.

1 Like

What is Memex’s policy on cross-browser support? Chrome has the pageCapture API for extensions, which saves the page as MHTML. MHTML is imperfect (sometimes it doesn’t look right), and it has the limitation that it only renders when loaded from a file:// URL. But it has the advantage that it is already implemented.

Derek_Dwilson’s suggestion #2 would be great for my purposes:

  2. An option to automatically archive any page when an annotation is made on it.

I would add that I’d like to be able to save an archive any time I do anything to the page (tag, highlight, annotate, etc.). Otherwise the full-text search feature works for me.
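
To make the rule concrete, here is a rough sketch in Python of what I have in mind. Everything in it is hypothetical (the `PageVisit` shape, the action names, the optional time threshold suggested earlier in the thread); it is not based on how Memex stores visits.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical action names; any of them marks a page as worth archiving.
ARCHIVE_TRIGGERS = {"annotate", "highlight", "tag", "bookmark"}


@dataclass
class PageVisit:
    url: str
    seconds_on_page: float = 0.0
    actions: Set[str] = field(default_factory=set)


def should_archive(visit: PageVisit, min_seconds: Optional[float] = None) -> bool:
    """Archive if the user acted on the page, or stayed long enough."""
    if visit.actions & ARCHIVE_TRIGGERS:
        return True
    # The time rule is opt-in: with no threshold set, time alone never triggers.
    return min_seconds is not None and visit.seconds_on_page >= min_seconds
```

With this, a page I tagged is always archived, while a page I only skimmed is archived only if a time threshold is configured and reached.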

It would still be nice to be able to view the full text of all other pages, though, since they’re available to Memex somewhere.

1 Like

Thanks @hopula & @ybbond for some more ideas on how to implement this.

We will start with the “Reader”, which gives you a slimmed-down version of a page that you can save.

Indeed we will give some more settings for this.
These are the mockups. Do they fit your use cases?

I only see a single screenshot of the settings pages–is that the mockup you’re referring to?

Can you summarize what a reader mode means? Is it text only, or is it also other markup (like images, if images can be saved offline)?

I would prefer the ability to only save images if a page is bookmarked/highlighted/tagged/listed. As I read that settings panel, it seems as if reader mode is basically the same thing as already happens, except that now it is viewable, not just searchable.

Since images will add a considerable amount of size, naively I think I would like to be able to view the full-text (ie see the reader version) of all pages, but see images only in the pages I was especially interested in (ie tagged etc).

I only see a single screenshot of the settings pages–is that the mockup you’re referring to?

Yep, I wanted to show you the settings we have in mind and ask whether they cover your use cases.

Can you summarize what a reader mode means? Is it text only, or is it also other markup (like images, if images can be saved offline)?

In the first version we only save text, and load the images when people have a connection.
We do so because indeed this can become quite data heavy if we need to also sync all images.
Hence we roll this out with care.
Your suggestion to “but see images only in the pages I was especially interested in (ie tagged etc)” is a good one indeed to make sure not too many pages are synced.
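
As a rough illustration of “text first, images later”, here is a minimal sketch in Python. It is not our implementation; the class and function names are made up, and real reader modes do much more (boilerplate removal, keeping headings and links). It keeps the visible text of a page and only records image URLs so they can be fetched when a connection is available.

```python
from html.parser import HTMLParser

# Elements whose contents are never shown to the reader.
SKIPPED = {"script", "style", "noscript"}


class ReaderText(HTMLParser):
    """Collect visible text and remember image URLs without downloading them."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self._skip_depth += 1
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_urls.append(src)

    def handle_endtag(self, tag):
        if tag in SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def to_reader(page_html):
    """Return (text, image_urls) for a page; one text chunk per line."""
    parser = ReaderText()
    parser.feed(page_html)
    parser.close()
    return "\n".join(parser.chunks), parser.image_urls
```

Only the text part would need to be synced; the image list stays small and can be resolved lazily, or eagerly for pages that were tagged or annotated.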

it seems as if reader mode is basically the same thing as already happens, except that now it is viewable, not just searchable.

Yes exactly. The reader adds the ability to view pages you visited, instead of just searching them.
Did you expect something different?

From my perspective, the reader view is a good approach. My previous concern was related to this issue: “Archiving & Full Text Search Won't Work for 301 Redirected Page (e.g. Medium)”. I think that issue is already fixed :+1: