THIS IS FOR EDUCATIONAL PURPOSES ONLY.


Steps

If you know an easier way to accomplish the same task, please open an issue or send a PR against this file and add your steps/instructions below.


Command-line

On Windows, Cmder or Windows Terminal with WSL is recommended.


Copying site

Tools that helped in the process: Wget

Run on command-line:


wget --mirror --page-requisites --adjust-extension --convert-links -e robots=off https://example.com

or

wget -m -p -E -k -e robots=off https://example.com



Filtering Assets (PNG, JPG, SVG, GIF, WEBP)

Tools that helped in the process: Everything

  • Go to the path where you copied the website earlier using Wget and filter by .png, .jpg, ...

  • Select all, right-click and click on Copy Full Name to Clipboard - This will copy the absolute path of files.
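Where Everything is not available (inside WSL, for example), the same file list can be approximated with find. This is a minimal sketch rather than the method used for the repository: the function name list_images is hypothetical, and it prints absolute paths with forward slashes instead of Windows-style backslashes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the absolute path of every image file under
# the given directory, one per line, sorted for a stable order.
list_images() {
  local root
  root="$(cd "$1" && pwd)" || return 1   # resolve to an absolute path
  find "$root" -type f \( \
      -iname '*.png' -o -iname '*.jpg' -o -iname '*.svg' \
      -o -iname '*.gif' -o -iname '*.webp' \) | sort
}
```

Calling it as list_images ./example.com > path.md (with the directory Wget created in place of ./example.com) would write the list straight to a file. Because the paths already use /, the later step of replacing \ with / would not be needed for this output.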



VSCode arranging content

Tools that helped in the process: Visual Studio Code, paste

  • Create a new file with any extension (for example, path.md) and paste in the file paths copied from Everything.


  • Replace everything up to and including ..\developer.android.com with https://developer.android.com


  • Replace \ with /

  • Create two new files, title.md and image.md, in the same directory as path.md

  • Copy the contents of path.md into title.md

  • In title.md, turn on regex mode in VSCode's find and replace, enter .*/ in the search field, leave the replacement empty and replace all - This removes everything on each line up to and including the last /



  • Enter \.png (escape the dot while regex mode is on, otherwise . matches any character) and replace it with an empty string to get the title.


  • Enter the regex ^ and replace it with #### followed by a space to turn each title into a heading.


  • On the command line, merge title.md and path.md line by line, separated by ^, with paste -d "^" title.md path.md > tp.md


  • Perform regex: $(?<!\.png)(?<!\.gif)(?<!\.svg)(?<!\.jpg)(?<!\.webp)(\n) on tp.md and replace with empty string - This matches the line break after every line that does not end with .png, .gif, .svg, .jpg or .webp and removes it, joining that line with the next one.


  • Copy the contents of path.md into image.md

  • In image.md, replace https: with ![](https: and replace .png with .png) - This wraps each URL in markdown image syntax so the image renders.



  • On the command line, interleave the lines of tp.md and image.md with paste -d "\n" tp.md image.md > tpi.md (quote the \n so the shell does not strip the backslash)


  • In tpi.md, search for the regex \^ and replace each ^ with a line break (Ctrl+Enter in the replace field)


  • Search for the regex ^https: and replace it with Source: https: - The ^ anchor restricts the match to an https: at the beginning of a line.


  • Replace #### with <br> {Ctrl+Enter} #### - Press Ctrl+Enter in the replace field to insert the line break.


  • Done. This will produce the same output as in the repository.
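The editor steps above can also be approximated as a script. The following is a rough sketch under stated assumptions, not the procedure used for the repository: the function name build_tpi and the temporary file tpi.tmp are hypothetical, every input line is assumed to end in an image extension (so the lookbehind clean-up step is skipped), and, like the steps above, only .png is stripped from titles and given a closing parenthesis.

```shell
#!/usr/bin/env bash
# Sketch: reads the raw file paths copied from Everything on stdin and
# writes path.md, title.md, tp.md, image.md and tpi.md into the current
# directory, mirroring the find-and-replace steps.
build_tpi() {
  # Drop Windows carriage returns, turn the path prefix into the site
  # URL, then convert backslashes to forward slashes.
  tr -d '\r' |
    sed -e 's|^.*developer\.android\.com|https://developer.android.com|' \
        -e 's|\\|/|g' > path.md

  # Title: drop everything up to the last "/", drop ".png", add "#### ".
  sed -e 's|.*/||' -e 's|\.png$||' -e 's|^|#### |' path.md > title.md

  # Heading and URL on one line, joined by "^".
  paste -d '^' title.md path.md > tp.md

  # Markdown image syntax: ![](url)
  sed -e 's|^https:|![](https:|' -e 's|\.png$|.png)|' path.md > image.md

  # Interleave: one tp.md line, then the matching image.md line.
  paste -d '\n' tp.md image.md > tpi.md

  # Split at "^", label the URL line, and put <br> before each heading.
  tr '^' '\n' < tpi.md | awk '
    /^#### /  { print "<br>"; print; next }
    /^https:/ { print "Source: " $0; next }
    { print }' > tpi.tmp && mv tpi.tmp tpi.md
}
```

Assuming the copied paths were saved to a file (the name paths.txt here is only an example), running build_tpi < paths.txt leaves the final result in tpi.md, with the intermediate files kept alongside so each stage can be compared with the manual steps.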