Not replaying XHR POST request from legacy collection #861

Closed
despens opened this issue Aug 25, 2023 · 2 comments

despens commented Aug 25, 2023

Somewhere before pywb version 2.3.1, a particular POST request stopped replaying. The problem affects old Instagram captures.

At the moment, the web archive cannot be replayed in pywb 2.3.1 or 2.7.4, but it works in Conifer.

Steps to reproduce the bug

The same web archive is available at 3 different deployments:

| Status | Deployment | pywb | URL |
|---|---|---|---|
| works | Conifer | 2.5.0 | https://conifer.rhizome.org/despens/amalia-ulman-excellences--perfections/list/four-personas/b1/20141014150552/http://instagram.com/amaliaulman |
| fails (404) | Webenact | 2.3.1 | https://webenact.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/amaliaulman |
| fails (404) | Rhizome Webarchives | 2.7.4 | https://webarchives.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/amaliaulman |

Click on any of the Instagram photos in the grid to trigger the POST request, for instance this one:
(Screenshot from 2023-08-25 12-53-17)

HAR logs

Conifer
{
  "log": {
    "version": "1.2",
    "creator": {
      "name": "Firefox",
      "version": "117.0"
    },
    "browser": {
      "name": "Firefox",
      "version": "117.0"
    },
    "pages": [
      {
        "id": "page_1",
        "pageTimings": {
          "onContentLoad": -533266,
          "onLoad": -533261
        },
        "startedDateTime": "2023-08-25T12:54:24.168+02:00",
        "title": "https://conifer.rhizome.org/despens/amalia-ulman-excellences--perfections/list/four-personas/b1/20141014150552/http://instagram.com/amaliaulman"
      }
    ],
    "entries": [
      {
        "startedDateTime": "2023-08-25T12:54:24.168+02:00",
        "request": {
          "bodySize": 596,
          "method": "POST",
          "url": "https://cones.conifer.rhizome.org/despens/amalia-ulman-excellences--perfections/list/four-personas/b1/20141014150552mp_/http://instagram.com/query/",
          "httpVersion": "HTTP/2",
          "headers": [
            {
              "name": "Host",
              "value": "cones.conifer.rhizome.org"
            },
            {
              "name": "User-Agent",
              "value": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/117.0"
            },
            {
              "name": "Accept",
              "value": "application/json, text/javascript, */*; q=0.01"
            },
            {
              "name": "Accept-Language",
              "value": "en-US,en;q=0.7,de-DE;q=0.3"
            },
            {
              "name": "Accept-Encoding",
              "value": "gzip, deflate, br"
            },
            {
              "name": "X-Pywb-Requested-With",
              "value": "XMLHttpRequest"
            },
            {
              "name": "Content-Type",
              "value": "application/x-www-form-urlencoded; charset=UTF-8"
            },
            {
              "name": "X-Instagram-AJAX",
              "value": "1"
            },
            {
              "name": "X-CSRFToken",
              "value": "deleted"
            },
            {
              "name": "X-Requested-With",
              "value": "XMLHttpRequest"
            },
            {
              "name": "Content-Length",
              "value": "596"
            },
            {
              "name": "Origin",
              "value": "https://cones.conifer.rhizome.org"
            },
            {
              "name": "Connection",
              "value": "keep-alive"
            },
            {
              "name": "Referer",
              "value": "https://cones.conifer.rhizome.org/despens/amalia-ulman-excellences--perfections/list/four-personas/b1/20141014150552mp_/http://instagram.com/p/s67XD2FV5l/?modal=true"
            },
            {
              "name": "Cookie",
              "value": "__wr_sesh=WyJhVlNoekY2dWRSK1UxZUxnZ0xTcURyNHlCaDA9IixmYWxzZV0.ZOiGUw.C9L1H2HfgzeqRKN95kVzpLrgCW0"
            },
            {
              "name": "Sec-Fetch-Dest",
              "value": "empty"
            },
            {
              "name": "Sec-Fetch-Mode",
              "value": "cors"
            },
            {
              "name": "Sec-Fetch-Site",
              "value": "same-origin"
            },
            {
              "name": "Pragma",
              "value": "no-cache"
            },
            {
              "name": "Cache-Control",
              "value": "no-cache"
            }
          ],
          "cookies": [
            {
              "name": "__wr_sesh",
              "value": "WyJhVlNoekY2dWRSK1UxZUxnZ0xTcURyNHlCaDA9IixmYWxzZV0.ZOiGUw.C9L1H2HfgzeqRKN95kVzpLrgCW0"
            }
          ],
          "queryString": [],
          "headersSize": 1061,
          "postData": {
            "mimeType": "application/x-www-form-urlencoded",
            "params": [
              {
                "name": "q",
                "value": "ig_shortcode(s67XD2FV5l) { id, code, owner { id, username, is_private, profile_pic_url, followed_by_viewer, requested_by_viewer },is_video, video_url, shared_by_author, date, display_src, WB_wombat_location { name }, caption, usertags { nodes { user { username }, position }}, likes { count, viewer_has_liked, nodes { user { username, profile_pic_url, followed_by_viewer, requested_by_viewer }}}, comments.last(20) { nodes { id, user { username, profile_pic_url }, text, viewer_can_delete }}}"
              }
            ],
            "text": "q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code%2C+owner+%7B+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+WB_wombat_location+%7B+name+%7D%2C+caption%2C+usertags+%7B+nodes+%7B+user+%7B+username+%7D%2C+position+%7D%7D%2C+likes+%7B+count%2C+viewer_has_liked%2C+nodes+%7B+user+%7B+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%7D%7D%2C+comments.last(20)+%7B+nodes+%7B+id%2C+user+%7B+username%2C+profile_pic_url+%7D%2C+text%2C+viewer_can_delete+%7D%7D%7D"
          }
        },
        "response": {
          "status": 200,
          "statusText": "OK",
          "httpVersion": "HTTP/2",
          "headers": [
            {
              "name": "server",
              "value": "nginx/1.17.9"
            },
            {
              "name": "date",
              "value": "Fri, 25 Aug 2023 10:54:24 GMT"
            },
            {
              "name": "content-type",
              "value": "application/json"
            },
            {
              "name": "x-archive-orig-cache-control",
              "value": "private, no-cache, no-store, must-revalidate"
            },
            {
              "name": "x-archive-orig-content-encoding",
              "value": "gzip"
            },
            {
              "name": "content-language",
              "value": "en"
            },
            {
              "name": "x-archive-orig-date",
              "value": "Tue, 14 Oct 2014 16:29:19 GMT"
            },
            {
              "name": "x-archive-orig-expires",
              "value": "Sat, 01 Jan 2000 00:00:00 GMT"
            },
            {
              "name": "x-archive-orig-pragma",
              "value": "no-cache"
            },
            {
              "name": "x-archive-orig-server",
              "value": "nginx"
            },
            {
              "name": "set-cookie",
              "value": "csrftoken=deleted; Path=/despens/amalia-ulman-excellences--perfections/list/four-personas/b1/20141014150552mp_/http://instagram.com/"
            },
            {
              "name": "x-archive-orig-vary",
              "value": "Cookie, Accept-Language, Accept-Encoding"
            },
            {
              "name": "x-archive-orig-content-length",
              "value": "909"
            },
            {
              "name": "x-archive-orig-connection",
              "value": "keep-alive"
            },
            {
              "name": "content-security-policy",
              "value": "default-src 'unsafe-eval' 'unsafe-inline' 'self' data: blob: mediastream: ws: wss: conifer.rhizome.org/_set_session; form-action 'self'"
            },
            {
              "name": "strict-transport-security",
              "value": "max-age=31536000"
            },
            {
              "name": "X-Firefox-Spdy",
              "value": "h2"
            }
          ],
          "cookies": [
            {
              "name": "csrftoken",
              "value": "deleted"
            }
          ],
          "content": {
            "mimeType": "application/json",
            "size": 3048,
            "text": "{\"status\":\"ok\",\"code\":\"s67XD2FV5l\",\"shared_by_author\":true,\"usertags\":{\"nodes\":[]},\"owner\":{\"username\":\"amaliaulman\",\"requested_by_viewer\":false,\"followed_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/photos-a.ak.instagram.com\\/hphotos-ak-xaf1\\/10724811_680031252095464_259338633_a.jpg\",\"id\":\"202871366\",\"is_private\":false},\"comments\":{\"nodes\":[{\"text\":\"black and white roses\",\"viewer_can_delete\":false,\"id\":\"809233848265432742\",\"user\":{\"username\":\"yiming2014\",\"profile_pic_url\":\"http:\\/\\/images.ak.instagram.com\\/profiles\\/profile_1027541052_75sq_1393581182.jpg\"}},{\"text\":\"\\ud83d\\udc4f\\ud83d\\udc4f\\ud83d\\udc9f\",\"viewer_can_delete\":false,\"id\":\"809380233082134177\",\"user\":{\"username\":\"annasoldner\",\"profile_pic_url\":\"http:\\/\\/photos-f.ak.instagram.com\\/hphotos-ak-xfa1\\/10643885_239528626220653_744988609_a.jpg\"}}]},\"caption\":\"\\ud83d\\udc99\",\"likes\":{\"count\":130,\"viewer_has_liked\":false,\"nodes\":[{\"user\":{\"username\":\"michellerawlings\",\"profile_pic_url\":\"http:\\/\\/photos-h.ak.instagram.com\\/hphotos-ak-xap1\\/10362315_509364185856223_1822410738_a.jpg\"}},{\"user\":{\"username\":\"oscarsaurus_rex\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/photos-d.ak.instagram.com\\/hphotos-ak-xaf1\\/10610974_756432981085155_2039221624_a.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"leahschrager\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/images.ak.instagram.com\\/profiles\\/profile_1280121550_75sq_1399326744.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"hannahthoughts\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/images.ak.instagram.com\\/profiles\\/profile_423257681_75sq_1371768205.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"marilynschneider\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/photos-c.ak.instagram.com\\/hphotos-ak-xp
f1\\/10349626_239240692932746_1425483956_a.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"tictactoy\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/images.ak.instagram.com\\/profiles\\/profile_37452681_75sq_1372076997.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"chisenhalegallery\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/photos-d.ak.instagram.com\\/hphotos-ak-xpa1\\/10467860_310275525815555_931978272_a.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"ninandkris\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/photos-a.ak.instagram.com\\/hphotos-ak-xaf1\\/10617006_1475234602725696_1208035371_a.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"booksandwine_\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/photos-h.ak.instagram.com\\/hphotos-ak-xaf1\\/10661281_360850720743983_1963563015_a.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"simsimsakhai\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/images.ak.instagram.com\\/profiles\\/profile_1154100479_75sq_1394153158.jpg\",\"followed_by_viewer\":false}}]},\"date\":1410686581.0,\"is_video\":false,\"id\":\"809220152487140965\",\"display_src\":\"http:\\/\\/photos-g.ak.instagram.com\\/hphotos-ak-xaf1\\/10608019_719252001482622_659224595_n.jpg\"}"
          },
          "redirectURL": "",
          "headersSize": 926,
          "bodySize": 3974
        },
        "cache": {},
        "timings": {
          "blocked": 0,
          "dns": 1,
          "connect": 180,
          "ssl": 200,
          "send": 0,
          "wait": 286,
          "receive": 0
        },
        "time": 667,
        "_securityState": "secure",
        "serverIPAddress": "54.164.112.170",
        "connection": "443",
        "pageref": "page_1"
      }
    ]
  }
}
Webenact
{
  "log": {
    "version": "1.2",
    "creator": {
      "name": "Firefox",
      "version": "117.0"
    },
    "browser": {
      "name": "Firefox",
      "version": "117.0"
    },
    "pages": [
      {
        "id": "page_1",
        "pageTimings": {
          "onContentLoad": -146122,
          "onLoad": -143630
        },
        "startedDateTime": "2023-08-25T13:03:01.894+02:00",
        "title": "https://webenact.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/amaliaulman"
      }
    ],
    "entries": [
      {
        "startedDateTime": "2023-08-25T13:03:01.894+02:00",
        "request": {
          "bodySize": 596,
          "method": "POST",
          "url": "https://webenact.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/query/",
          "httpVersion": "HTTP/1.1",
          "headers": [
            {
              "name": "Host",
              "value": "webenact.rhizome.org"
            },
            {
              "name": "User-Agent",
              "value": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/117.0"
            },
            {
              "name": "Accept",
              "value": "application/json, text/javascript, */*; q=0.01"
            },
            {
              "name": "Accept-Language",
              "value": "en-US,en;q=0.7,de-DE;q=0.3"
            },
            {
              "name": "Accept-Encoding",
              "value": "gzip, deflate, br"
            },
            {
              "name": "X-Pywb-Requested-With",
              "value": "XMLHttpRequest"
            },
            {
              "name": "Content-Type",
              "value": "application/x-www-form-urlencoded; charset=UTF-8"
            },
            {
              "name": "X-Instagram-AJAX",
              "value": "1"
            },
            {
              "name": "X-CSRFToken",
              "value": "deleted"
            },
            {
              "name": "X-Requested-With",
              "value": "XMLHttpRequest"
            },
            {
              "name": "Content-Length",
              "value": "596"
            },
            {
              "name": "Origin",
              "value": "https://webenact.rhizome.org"
            },
            {
              "name": "Connection",
              "value": "keep-alive"
            },
            {
              "name": "Referer",
              "value": "https://webenact.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/p/s67XD2FV5l/?modal=true"
            },
            {
              "name": "Sec-Fetch-Dest",
              "value": "empty"
            },
            {
              "name": "Sec-Fetch-Mode",
              "value": "cors"
            },
            {
              "name": "Sec-Fetch-Site",
              "value": "same-origin"
            },
            {
              "name": "Pragma",
              "value": "no-cache"
            },
            {
              "name": "Cache-Control",
              "value": "no-cache"
            }
          ],
          "cookies": [],
          "queryString": [],
          "headersSize": 854,
          "postData": {
            "mimeType": "application/x-www-form-urlencoded",
            "params": [
              {
                "name": "q",
                "value": "ig_shortcode(s67XD2FV5l) { id, code, owner { id, username, is_private, profile_pic_url, followed_by_viewer, requested_by_viewer },is_video, video_url, shared_by_author, date, display_src, WB_wombat_location { name }, caption, usertags { nodes { user { username }, position }}, likes { count, viewer_has_liked, nodes { user { username, profile_pic_url, followed_by_viewer, requested_by_viewer }}}, comments.last(20) { nodes { id, user { username, profile_pic_url }, text, viewer_can_delete }}}"
              }
            ],
            "text": "q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code%2C+owner+%7B+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+WB_wombat_location+%7B+name+%7D%2C+caption%2C+usertags+%7B+nodes+%7B+user+%7B+username+%7D%2C+position+%7D%7D%2C+likes+%7B+count%2C+viewer_has_liked%2C+nodes+%7B+user+%7B+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%7D%7D%2C+comments.last(20)+%7B+nodes+%7B+id%2C+user+%7B+username%2C+profile_pic_url+%7D%2C+text%2C+viewer_can_delete+%7D%7D%7D"
          }
        },
        "response": {
          "status": 404,
          "statusText": "Not Found",
          "httpVersion": "HTTP/1.1",
          "headers": [
            {
              "name": "Server",
              "value": "nginx/1.14.0 (Ubuntu)"
            },
            {
              "name": "Date",
              "value": "Fri, 25 Aug 2023 11:03:02 GMT"
            },
            {
              "name": "Content-Type",
              "value": "text/html"
            },
            {
              "name": "Transfer-Encoding",
              "value": "chunked"
            },
            {
              "name": "Connection",
              "value": "keep-alive"
            },
            {
              "name": "Content-Encoding",
              "value": "gzip"
            }
          ],
          "cookies": [],
          "content": {
            "mimeType": "text/html",
            "size": 720,
            "text": "<!DOCTYPE html>\n<html>\n<head>\n    <meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8;charset=utf-8\">\n    <title>URL Not Found</title>\n    <link rel=\"stylesheet\" href=\"https://webenact.rhizome.org/static/css/bootstrap.min.css\">\n    <script src=\"https://webenact.rhizome.org/static/js/jquery-latest.min.js\"></script>\n    <script src=\"https://webenact.rhizome.org/static/js/bootstrap.min.js\"></script>\n</head>\n<body>\n<div class=\"container\">\n    <div class=\"row justify-content-center\">\n        <h2 class=\"display-2\">URL Not Found</h2>\n        <p class=\"lead\">\n            The url <b>http://instagram.com/query/</b> could not be found in this collection.\n        </p>\n            </div>\n</div>\n</body>\n</html>"
          },
          "redirectURL": "",
          "headersSize": 195,
          "bodySize": 583
        },
        "cache": {},
        "timings": {
          "blocked": 0,
          "dns": 2,
          "connect": 111,
          "ssl": 117,
          "send": 0,
          "wait": 124,
          "receive": 0
        },
        "time": 354,
        "_securityState": "secure",
        "serverIPAddress": "35.245.250.198",
        "connection": "443",
        "pageref": "page_1"
      }
    ]
  }
}
Rhizome Webarchives
{
  "log": {
    "version": "1.2",
    "creator": {
      "name": "Firefox",
      "version": "117.0"
    },
    "browser": {
      "name": "Firefox",
      "version": "117.0"
    },
    "pages": [
      {
        "id": "page_1",
        "pageTimings": {
          "onContentLoad": -10519,
          "onLoad": -6630
        },
        "startedDateTime": "2023-08-25T13:05:11.629+02:00",
        "title": "https://webarchives.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/amaliaulman"
      }
    ],
    "entries": [
      {
        "startedDateTime": "2023-08-25T13:05:11.629+02:00",
        "request": {
          "bodySize": 586,
          "method": "POST",
          "url": "https://webarchives.rhizome.org/excellences-and-perfections/20141014150552mp_/http://instagram.com/query/",
          "httpVersion": "HTTP/1.1",
          "headers": [
            {
              "name": "Host",
              "value": "webarchives.rhizome.org"
            },
            {
              "name": "User-Agent",
              "value": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/117.0"
            },
            {
              "name": "Accept",
              "value": "application/json, text/javascript, */*; q=0.01"
            },
            {
              "name": "Accept-Language",
              "value": "en-US,en;q=0.7,de-DE;q=0.3"
            },
            {
              "name": "Accept-Encoding",
              "value": "gzip, deflate, br"
            },
            {
              "name": "X-Pywb-Requested-With",
              "value": "XMLHttpRequest"
            },
            {
              "name": "Content-Type",
              "value": "application/x-www-form-urlencoded; charset=UTF-8"
            },
            {
              "name": "X-Instagram-AJAX",
              "value": "1"
            },
            {
              "name": "X-CSRFToken",
              "value": "deleted"
            },
            {
              "name": "X-Requested-With",
              "value": "XMLHttpRequest"
            },
            {
              "name": "Content-Length",
              "value": "586"
            },
            {
              "name": "Origin",
              "value": "https://webarchives.rhizome.org"
            },
            {
              "name": "Connection",
              "value": "keep-alive"
            },
            {
              "name": "Referer",
              "value": "https://webarchives.rhizome.org/excellences-and-perfections/20141014150552mp_/http://instagram.com/p/s67XD2FV5l/?modal=true"
            },
            {
              "name": "Cookie",
              "value": "csrftoken=deleted; __utma=1.164060185.1413299153.1413299153.1413299153.1; __utmb=1.1.10.1413299153; __utmc=1; __utmz=1.1413299153.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmt=1"
            },
            {
              "name": "Sec-Fetch-Dest",
              "value": "empty"
            },
            {
              "name": "Sec-Fetch-Mode",
              "value": "cors"
            },
            {
              "name": "Sec-Fetch-Site",
              "value": "same-origin"
            },
            {
              "name": "Pragma",
              "value": "no-cache"
            },
            {
              "name": "Cache-Control",
              "value": "no-cache"
            }
          ],
          "cookies": [
            {
              "name": "csrftoken",
              "value": "deleted"
            },
            {
              "name": "__utma",
              "value": "1.164060185.1413299153.1413299153.1413299153.1"
            },
            {
              "name": "__utmb",
              "value": "1.1.10.1413299153"
            },
            {
              "name": "__utmc",
              "value": "1"
            },
            {
              "name": "__utmz",
              "value": "1.1413299153.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)"
            },
            {
              "name": "__utmt",
              "value": "1"
            }
          ],
          "queryString": [],
          "headersSize": 1068,
          "postData": {
            "mimeType": "application/x-www-form-urlencoded",
            "params": [
              {
                "name": "q",
                "value": "ig_shortcode(s67XD2FV5l) { id, code, owner { id, username, is_private, profile_pic_url, followed_by_viewer, requested_by_viewer },is_video, video_url, shared_by_author, date, display_src, location { name }, caption, usertags { nodes { user { username }, position }}, likes { count, viewer_has_liked, nodes { user { username, profile_pic_url, followed_by_viewer, requested_by_viewer }}}, comments.last(20) { nodes { id, user { username, profile_pic_url }, text, viewer_can_delete }}}"
              }
            ],
            "text": "q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code%2C+owner+%7B+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+location+%7B+name+%7D%2C+caption%2C+usertags+%7B+nodes+%7B+user+%7B+username+%7D%2C+position+%7D%7D%2C+likes+%7B+count%2C+viewer_has_liked%2C+nodes+%7B+user+%7B+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%7D%7D%2C+comments.last(20)+%7B+nodes+%7B+id%2C+user+%7B+username%2C+profile_pic_url+%7D%2C+text%2C+viewer_can_delete+%7D%7D%7D"
          }
        },
        "response": {
          "status": 404,
          "statusText": "Not Found",
          "httpVersion": "HTTP/1.1",
          "headers": [
            {
              "name": "Server",
              "value": "nginx/1.18.0 (Ubuntu)"
            },
            {
              "name": "Date",
              "value": "Fri, 25 Aug 2023 11:05:11 GMT"
            },
            {
              "name": "Content-Type",
              "value": "text/html"
            },
            {
              "name": "Transfer-Encoding",
              "value": "chunked"
            },
            {
              "name": "Connection",
              "value": "keep-alive"
            },
            {
              "name": "Content-Encoding",
              "value": "gzip"
            }
          ],
          "cookies": [],
          "content": {
            "mimeType": "text/html",
            "size": 1093,
            "text": "<!DOCTYPE html>\n<html lang=\"en\">\n    <head>\n        <meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8;charset=utf-8\"/>\n        <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">\n\n        <title>URL Not Found</title>\n\n        <link rel=\"stylesheet\" href=\"https://webarchives.rhizome.org/static/css/bootstrap.min.css\"/>\n<link rel=\"stylesheet\" href=\"https://webarchives.rhizome.org/static/css/font-awesome.min.css\">\n<link rel=\"stylesheet\" href=\"https://webarchives.rhizome.org/static/css/base.css\">\n\n<script src=\"https://webarchives.rhizome.org/static/js/jquery-latest.min.js\"></script>\n<script src=\"https://webarchives.rhizome.org/static/js/bootstrap.min.js\"></script>\n                            </head>\n\n    <body>\n                <header>\n   </header>\n        \n        <section>\n        <div class=\"container\">\n    <div class=\"col-12\">\n        <h4>URL Not Found</h4>\n    </div>\n    <p>\n        The url <b>http://instagram.com/query/</b> could not be found in this collection.    </p>\n    </div>\n        </section>\n\n                            </body>\n</html>"
          },
          "redirectURL": "",
          "headersSize": 195,
          "bodySize": 668
        },
        "cache": {},
        "timings": {
          "blocked": 0,
          "dns": 0,
          "connect": 0,
          "ssl": 0,
          "send": 0,
          "wait": 124,
          "receive": 0
        },
        "time": 124,
        "_securityState": "secure",
        "serverIPAddress": "35.236.219.133",
        "connection": "443",
        "pageref": "page_1"
      }
    ]
  }
}
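
One difference is visible when the three request bodies above are compared: Conifer and Webenact send `WB_wombat_location` in the `q` parameter, while Rhizome Webarchives sends `location`, which accounts for the 10-byte gap in `bodySize` (596 vs. 586). Whether this affects the index lookup is not established in this report. The following Python sketch shows one way to spot such a difference; the helper names and the shortened sample bodies are assumptions for illustration, not the full logged requests.

```python
import re
from urllib.parse import parse_qs


def query_tokens(body):
    """Decode a form-urlencoded POST body and list the identifiers in its `q` value."""
    q = parse_qs(body)["q"][0]
    return re.findall(r"[A-Za-z_][A-Za-z0-9_.]*", q)


def differing_tokens(body_a, body_b):
    """Pair tokens position by position and keep the pairs that differ.

    Assumes both bodies contain the same number of tokens, as in the logs above.
    """
    pairs = zip(query_tokens(body_a), query_tokens(body_b))
    return [(a, b) for a, b in pairs if a != b]


# Shortened stand-ins for the logged bodies (hypothetical, for illustration only).
conifer_body = "q=ig_shortcode(s67XD2FV5l)+%7B+display_src%2C+WB_wombat_location+%7B+name+%7D+%7D"
webarchives_body = "q=ig_shortcode(s67XD2FV5l)+%7B+display_src%2C+location+%7B+name+%7D+%7D"

diff = differing_tokens(conifer_body, webarchives_body)
size_gap = len(conifer_body) - len(webarchives_body)
print(diff, size_gap)
```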

Archive data

WARC
WARC/1.0
WARC-Type: response
WARC-Record-ID: <urn:uuid:de2f8e02-78fa-4727-bab5-20f60c540299>
WARC-Date: 2014-10-14T16:29:23Z
WARC-Target-URI: http://instagram.com/query/
WARC-IP-Address: 54.236.170.22
Content-Type: application/http;msgtype=response
Content-Length: 1366
WARC-Block-Digest: sha1:MX6AUYZCF6D5ADTOBAXKWCIP7Q4UPWJT
WARC-Payload-Digest: sha1:EQP327EORTS2QTCXGOS54KCDO6X47LBB

HTTP/1.1 200 OK
Cache-Control: private, no-cache, no-store, must-revalidate
Content-Encoding: gzip
Content-Language: en
Content-Type: application/json
Date: Tue, 14 Oct 2014 16:29:19 GMT
Expires: Sat, 01 Jan 2000 00:00:00 GMT
Pragma: no-cache
Server: nginx
Set-Cookie: csrftoken=deleted; expires=Tue, 13-Oct-2015 16:29:19 GMT; Max-Age=31449600; Path=/
Vary: Cookie, Accept-Language, Accept-Encoding
Content-Length: 909
Connection: keep-alive

<binary_data>

WARC/1.0
WARC-Type: request
WARC-Record-ID: <urn:uuid:e70f652f-f262-4fd8-8950-a9fcd1b6260a>
WARC-Date: 2014-10-14T16:29:23Z
WARC-Target-URI: http://instagram.com/query/
WARC-Concurrent-To: <urn:uuid:de2f8e02-78fa-4727-bab5-20f60c540299>
WARC-Block-Digest: sha1:KKIVWVB3DRWYSW2WI2UROXAP7UP6BQ7W
Content-Type: application/http;msgtype=request
Content-Length: 1334

POST /query/ HTTP/1.1
x-csrftoken: deleted
content-length: 596
accept-language: en-US,en;q=0.5
accept-encoding: gzip, deflate
referer: http://instagram.com/p/ruEFWFlV10/?modal=true
x-instagram-ajax: 1
accept: application/json, text/javascript, */*; q=0.01
user-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:32.0) Gecko/20100101 Firefox/32.0
connection: keep-alive
cookie: pywb.timestamp=20141014162919; csrftoken=deleted; mid=VD1PMgAEAAF9EQNZDMyG22AWoP1a; __utma=deleted; __utmb=deleted; __utmc=deleted; __utmz=deleted; __utmt=deleted; pywb_timestamp=deleted
pragma: no-cache
cache-control: no-cache
host: instagram.com
x-requested-with: XMLHttpRequest
content-type: application/x-www-form-urlencoded; charset=UTF-8

q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code%2C+owner+%7B+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+WB_wombat_location+%7B+name+%7D%2C+caption%2C+usertags+%7B+nodes+%7B+user+%7B+username+%7D%2C+position+%7D%7D%2C+likes+%7B+count%2C+viewer_has_liked%2C+nodes+%7B+user+%7B+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%7D%7D%2C+comments.last(20)+%7B+nodes+%7B+id%2C+user+%7B+username%2C+profile_pic_url+%7D%2C+text%2C+viewer_can_delete+%7D%7D%7D
CDXJ
com,instagram)/query?q=ig_shortcode(s67xd2fv5l)%20{%20id,%20code,%20owner%20{%20id,%20username,%20is_private,%20profile_pic_url,%20followed_by_viewer,%20requested_by_viewer%20},is_video,%20video_url,%20shared_by_author,%20date,%20display_src,%20location%20{%20name%20},%20caption,%20usertags%20{%20nodes%20{%20user%20{%20username%20},%20position%20}},%20likes%20{%20count,%20viewer_has_liked,%20nodes%20{%20user%20{%20username,%20profile_pic_url,%20followed_by_viewer,%20requested_by_viewer%20}}},%20comments.last(20)%20{%20nodes%20{%20id,%20user%20{%20username,%20profile_pic_url%20},%20text,%20viewer_can_delete%20}}} 20141014162923 {"url":"http://instagram.com/query/","mime":"application/json","status":"200","digest":"EQP327EORTS2QTCXGOS54KCDO6X47LBB","length":"1656","offset":"11394396","filename":"excellences-and-perfections_desktop-p3.warc.gz"}
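
For readers less familiar with the index format: the line above consists of a lookup key, a 14-digit timestamp, and a JSON object, separated by single spaces. Note that the key carries the POST body as a query string, although the recorded `url` is just `http://instagram.com/query/`. Below is a minimal Python sketch for splitting such a line. The `sample` line is a shortened stand-in for the entry above and the names `CdxjEntry` and `parse_cdxj_line` are hypothetical; this is not pywb's own parser.

```python
import json
from typing import NamedTuple


class CdxjEntry(NamedTuple):
    key: str
    timestamp: str
    fields: dict


def parse_cdxj_line(line):
    """Split one CDXJ line into key, timestamp and the decoded JSON block.

    Relies on the key containing no literal spaces (they appear as %20 above).
    """
    key, timestamp, raw_json = line.strip().split(" ", 2)
    return CdxjEntry(key, timestamp, json.loads(raw_json))


# Shortened, illustrative version of the index line shown above.
sample = (
    "com,instagram)/query?q=ig_shortcode(s67xd2fv5l)%20{%20id,%20code%20} "
    '20141014162923 {"url":"http://instagram.com/query/","mime":"application/json",'
    '"status":"200","filename":"excellences-and-perfections_desktop-p3.warc.gz"}'
)

entry = parse_cdxj_line(sample)
print(entry.timestamp, entry.fields["url"])
```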

Testing with GET

The resource is always available when queried via GET:

Conifer: https://conifer.rhizome.org/despens/amalia-ulman-excellences--perfections/http://instagram.com/query/?q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code%2C+owner+%7B+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+WB_wombat_location+%7B+name+%7D%2C+caption%2C+usertags+%7B+nodes+%7B+user+%7B+username+%7D%2C+position+%7D%7D%2C+likes+%7B+count%2C+viewer_has_liked%2C+nodes+%7B+user+%7B+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%7D%7D%2C+comments.last(20)+%7B+nodes+%7B+id%2C+user+%7B+username%2C+profile_pic_url+%7D%2C+text%2C+viewer_can_delete+%7D%7D%7D

Webenact: https://webenact.rhizome.org/excellences-and-perfections/http://instagram.com/query/?q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code%2C+owner+%7B+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+WB_wombat_location+%7B+name+%7D%2C+caption%2C+usertags+%7B+nodes+%7B+user+%7B+username+%7D%2C+position+%7D%7D%2C+likes+%7B+count%2C+viewer_has_liked%2C+nodes+%7B+user+%7B+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%7D%7D%2C+comments.last(20)+%7B+nodes+%7B+id%2C+user+%7B+username%2C+profile_pic_url+%7D%2C+text%2C+viewer_can_delete+%7D%7D%7D

Rhizome Webarchives: https://webarchives.rhizome.org/excellences-and-perfections/http://instagram.com/query/?q=ig_shortcode(s67XD2FV5l)+{+id%2C+code%2C+owner+{+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+}%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+WB_wombat_location+{+name+}%2C+caption%2C+usertags+{+nodes+{+user+{+username+}%2C+position+}}%2C+likes+{+count%2C+viewer_has_liked%2C+nodes+{+user+{+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+}}}%2C+comments.last(20)+{+nodes+{+id%2C+user+{+username%2C+profile_pic_url+}%2C+text%2C+viewer_can_delete+}}}
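The GET test URLs above are the POST target URL with the form-encoded request body appended as a query string. A minimal sketch of that construction follows. The body is shortened for readability, and `body_as_get_url` is a hypothetical helper, not part of pywb:

```python
from urllib.parse import unquote_plus

# Shortened version of the POST body from the HAR log (assumption: abbreviated
# here for readability; the real body lists many more fields).
POST_BODY = "q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code+%7D"
TARGET_URL = "http://instagram.com/query/"


def body_as_get_url(target_url, form_body):
    """Append a form-encoded POST body to the target URL as a query string."""
    separator = "&" if "?" in target_url else "?"
    return target_url + separator + form_body


get_url = body_as_get_url(TARGET_URL, POST_BODY)

# Decoding the form encoding ('+' -> space, %XX -> character) gives the
# human-readable query, in the same shape as the "requestBody" index field.
decoded_body = unquote_plus(POST_BODY)

print(get_url)
print(decoded_body)
```

The resulting URL is what gets prefixed with the collection path in each of the three deployments.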

Summary

This seems to be a change in pywb behavior that breaks replay for some POST requests.

cc @m4rk3r @mona-ul


mona-ul commented Nov 8, 2023

We found the solution for the issue: We simply had to reindex (wb-manager reindex) the collection.
The collection is now fully functional with pywb 2.7.4 on Rhizome Webarchives: https://webarchives.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/amaliaulman

The previous index.cdxj for the collection was created in May 2019 and was compatible with the pywb version in use at that time. It seems that somewhere along the way, newer pywb versions began handling POST requests differently, so the existing index no longer matched what pywb expected. This mismatch caused the POST request to fail on replay.

Is this analysis correct?

Index entry of POST request, May 2019, pywb version unknown
com,instagram)/query?q=ig_shortcode(s67xd2fv5l)%20{%20id,%20code,%20owner%20{%20id,%20username,%20is_private,%20profile_pic_url,%20followed_by_viewer,%20requested_by_viewer%20},is_video,%20video_url,%20shared_by_author,%20date,%20display_src,%20location%20{%20name%20},%20caption,%20usertags%20{%20nodes%20{%20user%20{%20username%20},%20position%20}},%20likes%20{%20count,%20viewer_has_liked,%20nodes%20{%20user%20{%20username,%20profile_pic_url,%20followed_by_viewer,%20requested_by_viewer%20}}},%20comments.last(20)%20{%20nodes%20{%20id,%20user%20{%20username,%20profile_pic_url%20},%20text,%20viewer_can_delete%20}}} 20141014162923 {"url":"http://instagram.com/query/","mime":"application/json","status":"200","digest":"EQP327EORTS2QTCXGOS54KCDO6X47LBB","length":"1656","offset":"11394396","filename":"excellences-and-perfections_desktop-p3.warc.gz"}

Index entry of POST request, Nov 2023, pywb version 2.7.4 - with "requestBody"
com,instagram)/query?__wb_method=post&q=ig_shortcode(s67xd2fv5l)%20{%20id,%20code,%20owner%20{%20id,%20username,%20is_private,%20profile_pic_url,%20followed_by_viewer,%20requested_by_viewer%20},is_video,%20video_url,%20shared_by_author,%20date,%20display_src,%20location%20{%20name%20},%20caption,%20usertags%20{%20nodes%20{%20user%20{%20username%20},%20position%20}},%20likes%20{%20count,%20viewer_has_liked,%20nodes%20{%20user%20{%20username,%20profile_pic_url,%20followed_by_viewer,%20requested_by_viewer%20}}},%20comments.last(20)%20{%20nodes%20{%20id,%20user%20{%20username,%20profile_pic_url%20},%20text,%20viewer_can_delete%20}}} 20141014162923 {"url": "http://instagram.com/query/", "mime": "application/json", "status": "200", "digest": "EQP327EORTS2QTCXGOS54KCDO6X47LBB", "length": "1656", "offset": "11394396", "method": "POST", "requestBody": "q=ig_shortcode(s67XD2FV5l) { id, code, owner { id, username, is_private, profile_pic_url, followed_by_viewer, requested_by_viewer },is_video, video_url, shared_by_author, date, display_src, WB_wombat_location { name }, caption, usertags { nodes { user { username }, position }}, likes { count, viewer_has_liked, nodes { user { username, profile_pic_url, followed_by_viewer, requested_by_viewer }}}, comments.last(20) { nodes { id, user { username, profile_pic_url }, text, viewer_can_delete }}}", "filename": "excellences-and-perfections_desktop-p3.warc.gz"}
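The difference between the two entries above can be made explicit with a small script. This is only a sketch for comparing index lines and not pywb's own lookup code. The two sample lines are shortened versions of the entries above, and `parse_cdxj_line` and `post_marker_in_key` are hypothetical helper names:

```python
import json

# Shortened versions of the two index entries (assumption: query and JSON
# fields abbreviated for readability).
LEGACY_LINE = (
    'com,instagram)/query?q=ig_shortcode(s67xd2fv5l) 20141014162923 '
    '{"url":"http://instagram.com/query/","status":"200"}'
)
REINDEXED_LINE = (
    'com,instagram)/query?__wb_method=post&q=ig_shortcode(s67xd2fv5l) 20141014162923 '
    '{"url": "http://instagram.com/query/", "status": "200", "method": "POST"}'
)


def parse_cdxj_line(line):
    """Split a CDXJ line into (key, timestamp, fields)."""
    key, timestamp, raw_json = line.strip().split(" ", 2)
    return key, timestamp, json.loads(raw_json)


def post_marker_in_key(key):
    """True if the key's query string starts with the __wb_method=post marker."""
    _, _, query = key.partition("?")
    return query.startswith("__wb_method=post")


legacy_key, _, legacy_fields = parse_cdxj_line(LEGACY_LINE)
new_key, _, new_fields = parse_cdxj_line(REINDEXED_LINE)

print(post_marker_in_key(legacy_key), legacy_fields.get("method"))  # False None
print(post_marker_in_key(new_key), new_fields.get("method"))        # True POST
print(legacy_key == new_key)                                        # False
```

If replay looks up the POST under a key of the second form, an index that only contains the first form has no exact match. That would be consistent with the behavior reported in this issue.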

@despens despens changed the title Not replaying XHR POST request Not replaying XHR POST request from legacy collection Nov 8, 2023

despens commented May 29, 2024

This issue was solved by re-indexing the collection.

@despens despens closed this as completed May 29, 2024