Tests failing on Windows in the GH Workflow CI #650

Open
ibnesayeed opened this issue May 18, 2020 · 7 comments

Comments

@ibnesayeed

During the recent implementation of GH Workflow CI #648 for cross-platform matrix testing, we found that tests were failing on Windows and hanging the process indefinitely. One such failed attempt can be seen in a previous CI log (expand the skipped "Run Tests" step for details). The log shows some exceptions, which need to be investigated manually on a Windows machine.

The failure is unlikely to come from the newly created IPFS Setup Action, because its own tests pass on all platforms, for both the CLI and the API.

@machawk1

machawk1 commented Jul 3, 2020

I am able to replicate a very long delay on a Windows 10 VM. Using latest master (4fca791):

  • the first test-backends fails
  • all other test-backends pass
  • all test-compile-uri pass
  • all test-indexing pass
  • the first test-memento fails after a short delay.
  • the second test in test-memento takes a long time without seeming to end.

For this, I am running the ipfs daemon separately (v0.6.0).

If I do not start the daemon separately and just run py -m pytest ./:

  • the first and fourth test-backends fail
  • both test-indexing fail
  • all test-memento fail pretty quickly
  • ...and the testing proceeds.
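
The two runs above differ only in whether a daemon is listening on the API port, so a quick reachability check can tell them apart before the tests start. Below is a minimal sketch, not code from ipwb: the helper name `ipfs_api_reachable` and the 2-second timeout are assumptions made for illustration, and `localhost:5001` is the API address shown in the traceback.

```python
import socket


def ipfs_api_reachable(host="localhost", port=5001, timeout=2.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper (not part of ipwb). It only checks that the port
    accepts a connection, not that the listener is actually an IPFS daemon.
    """
    try:
        # create_connection raises an OSError subclass on refusal or timeout,
        # e.g. ConnectionRefusedError (WinError 10061 on Windows).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "reachable" if ipfs_api_reachable() else "not reachable"
    print(f"IPFS API on localhost:5001 is {state}")
```

One possible use is a `pytest.mark.skipif(not ipfs_api_reachable(), reason=...)` marker on the daemon-dependent tests, so that a missing daemon is reported as a skip rather than as a chain of connection errors.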
Output of second run
                        if socket_options is not None:
                                for opt in socket_options:
                                        sock.setsockopt(*opt)
                    if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
                            sock.settimeout(timeout)
                    if source_address:
                            sock.bind(source_address)
                    sock.connect(sa)
                    return sock
            except OSError as e:
                    err = e
                    if sock is not None:
                            sock.close()
                            sock = None

    if err is not None:
          raise err

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:66:


address = ('localhost', 5001), timeout = <object object at 0x00986BF0>, source_address = None
socket_options = [(6, 1, 1)], family = <AddressFamily.AF_UNSPEC: 0>

def create_connection(address, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
                      source_address=None, socket_options=None,
                      family=socket.AF_UNSPEC):
    host, port = address
    if host.startswith('['):
            host = host.strip('[]')
    err = None

    if not family or family == socket.AF_UNSPEC:
            family = urllib3.util.connection.allowed_gai_family()

    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            sock = None
            try:
                    sock = socket.socket(af, socktype, proto)

                    # If provided, set socket level options before connecting.
                    if socket_options is not None:
                            for opt in socket_options:
                                    sock.setsockopt(*opt)

                    if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
                            sock.settimeout(timeout)
                    if source_address:
                            sock.bind(source_address)
                  sock.connect(sa)

E ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:57: ConnectionRefusedError

During handling of the above exception, another exception occurred:

hstr = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding'
payload = b'Memento for 2/2/2013 10:00am\n'

def pushToIPFS(hstr, payload):
    ipfsRetryCount = 5  # WARC->IPFS attempts before giving up
    retryCount = 0
    while retryCount < ipfsRetryCount:
        try:
            # Py 2/3 str/unicode/byte resolution
            if isinstance(hstr, str):
                hstr = s2b(hstr)
            if isinstance(payload, str):
                payload = s2b(payload)

            if len(payload) == 0:  # py-ipfs-api issue #137
                return
          httpHeaderIPFSHash = pushBytesToIPFS(hstr)

ipwb\indexer.py:80:


bytes = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding'

def pushBytesToIPFS(bytes):
    """
    Call the IPFS API to add the byte string to IPFS.
    When IPFS returns a hash, return this to the caller
    """
    global IPFS_API

    # Returns unicode in py2.7, str in py3.7
    try:
      res = IPFS_API.add_bytes(bytes)  # bytes)

ipwb\indexer.py:383:


args = (<ipfshttpclient.client.Client object at 0x03F4ACB8>, b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding')
kwargs = {}

@wraps(cmd)
def wrapper(*args, **kwargs):
    """Returns the specified field of the command invocation.

    Parameters
    ----------
    args : list
            Positional parameters to pass to the wrapped callable
    kwargs : dict
            Named parameter to pass to the wrapped callable
    """
  res = cmd(*args, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\utils.py:148:


args = (<ipfshttpclient.client.Client object at 0x03F4ACB8>, b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding')
kwargs = {}

@functools.wraps(func)
def wrapper2(*args: ty.Any, **kwargs: ty.Any) -> R:
  result = func(*args, **kwargs)  # type: T

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\client\base.py:136:


self = <ipfshttpclient.client.Client object at 0x03F4ACB8>
data = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding'
kwargs = {}, body = <generator object BytesFileStream.body at 0x04B64760>
headers = {'Content-Disposition': 'form-data; name="file"; filename="bytes"', 'Content-Type': 'multipart/form-data; boundary="ba5d63b578524be9a34d40fc82343a16"'}

@utils.return_field('Hash')
@base.returns_single_item(dict)
def add_bytes(self, data, **kwargs):
    """Adds a set of bytes as a file to IPFS.

    .. code-block:: python

            >>> client.add_bytes(b"Mary had a little lamb")
            'QmZfF6C9j4VtoCsTp4KSrhYH47QMd3DNXVZBKaxJdhaPab'

    Also accepts and will stream generator objects.

    Parameters
    ----------
    data : bytes
            Content to be added as a file

    Returns
    -------
            str
                    Hash of the added IPFS object
    """
    body, headers = multipart.stream_bytes(data, chunk_size=self.chunk_size)
  return self._client.request('/add', decoder='json',
                                data=body, headers=headers, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\client\__init__.py:257:


self = <ipfshttpclient.http_requests.ClientSync object at 0x048B4528>, path = '/add', args = []

def request(
            self, path: str,
            args: ty.Sequence[str] = [], *,
            opts: ty.Mapping[str, str] = {},
            decoder: str = "none",
            stream: bool = False,
            offline: bool = False,
            return_result: bool = True,
            auth: auth_t = None,
            cookies: cookies_t = None,
            data: reqdata_sync_t = None,
            headers: headers_t = None,
            timeout: timeout_t = None
) -> ty.Optional[ty.Union[  # noqa: ET122 (checker bug)
    StreamDecodeIteratorSync[bytes],
    StreamDecodeIteratorSync[object],
    bytes,
    ty.List[object],
]]:
    """Sends an HTTP request to the IPFS daemon

    This function returns the contents of the HTTP response from the IPFS
    daemon.

    Raises
    ------
    ~ipfshttpclient.exceptions.ErrorResponse
    ~ipfshttpclient.exceptions.ConnectionError
    ~ipfshttpclient.exceptions.ProtocolError
    ~ipfshttpclient.exceptions.StatusError
    ~ipfshttpclient.exceptions.TimeoutError

    Parameters
    ----------
    path
            The command path relative to the given base
    decoder
            The encoder to use to parse the HTTP response
    stream
            Whether to return an iterable yielding the received items incrementally
            instead of receiving and decoding all items up-front before returning
            them
    args
            Positional parameters to be sent along with the HTTP request
    opts
            Query string paramters to be sent along with the HTTP request
    offline
            Whether to request to daemon to handle this request in “offline-mode”
    return_result
            Whether to decode the values received from the daemon
    auth
            Authentication data to send along with this request as
            ``(username, password)`` tuple
    cookies
            HTTP cookies to send along with each request to the API daemon
    data
            Iterable yielding data to stream from the client to the daemon
    headers
            Custom HTTP headers to pass send along with the request
    timeout
            How many seconds to wait for the server to send data
            before giving up

            Set this to :py:`math.inf` to disable timeouts entirely.
    """
    # Don't attempt to decode response or stream
    # (which would keep an iterator open that will then never be waited for)
    if not return_result:
            decoder = None

    # HTTP method must always be "POST" since go-IPFS 0.5
    method = "POST"
    if "use_http_head_for_no_result" in self.workarounds and not return_result:  # pragma: no cover
            method = "HEAD"

    parser = encoding.get_encoding(decoder if decoder else "none")
  closables, res = self._request(
            method, path, map_args_to_params(args, opts, offline=offline),
            auth=auth, data=data, headers=headers, timeout=timeout,
            chunk_size=None,
    )

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\http_common.py:564:


self = <ipfshttpclient.http_requests.ClientSync object at 0x048B4528>, method = 'POST', path = 'add', params = []

def _request(
            self, method: str, path: str, params: ty.Sequence[ty.Tuple[str, str]], *,
            auth: auth_t,
            data: reqdata_sync_t,
            headers: headers_t,
            timeout: timeout_t,
            chunk_size: ty.Optional[int]
) -> ty.Tuple[ty.List[Closable], ty.Iterator[bytes]]:
    # Ensure path is relative so that it is resolved relative to the base
    while path.startswith("/"):
            path = path[1:]

    url = urllib.parse.urljoin(self._base_url, path)

    try:
            # Determine session object to use
            closables, session = self._access_session()

            # Do HTTP request (synchronously) and map exceptions
            try:
                  res = session.request(
                            method=method,
                            url=url,
                            **map_args_to_requests(
                                    params=params,
                                    auth=auth,
                                    headers=headers,
                                    timeout=timeout,
                            ),
                            data=data,
                            stream=True,
                    )

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\http_requests.py:152:


self = <ipfshttpclient.requests_wrapper.Session object at 0x04E00070>, method = 'POST'
url = 'http://localhost:5001/api/v0/add', args = ()
kwargs = {'data': <generator object BytesFileStream.body at 0x04B64760>, 'headers': {'Content-Disposition': 'form-data; name="f...s"', 'Content-Type': 'multipart/form-data; boundary="ba5d63b578524be9a34d40fc82343a16"'}, 'params': {}, 'stream': True}
family = <AddressFamily.AF_UNSPEC: 0>

def request(self, method, url, *args, **kwargs):
    family = kwargs.pop("family", self.family)
    if family != socket.AF_UNSPEC:
            # Inject provided address family value as extension to scheme
            url = urllib.parse.urlparse(url)
            url = url._replace(scheme="{0}+{1}".format(url.scheme, AF2NAME[int(family)]))
            url = url.geturl()
  return super().request(method, url, *args, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:219:


self = <ipfshttpclient.requests_wrapper.Session object at 0x04E00070>, method = 'POST'
url = 'http://localhost:5001/api/v0/add', params = {}, data = <generator object BytesFileStream.body at 0x04B64760>
headers = {'Content-Disposition': 'form-data; name="file"; filename="bytes"', 'Content-Type': 'multipart/form-data; boundary="ba5d63b578524be9a34d40fc82343a16"'}
cookies = None, files = None, auth = None, timeout = None, allow_redirects = True, proxies = {}, hooks = None
stream = True, verify = None, cert = None, json = None

def request(self, method, url,
        params=None, data=None, headers=None, cookies=None, files=None,
        auth=None, timeout=None, allow_redirects=True, proxies=None,
        hooks=None, stream=None, verify=None, cert=None, json=None):
    """Constructs a :class:`Request <Request>`, prepares it and sends it.
    Returns :class:`Response <Response>` object.

    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary or bytes to be sent in the query
        string for the :class:`Request`.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
        object to send in the body of the :class:`Request`.
    :param json: (optional) json to send in the body of the
        :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the
        :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the
        :class:`Request`.
    :param files: (optional) Dictionary of ``'filename': file-like-objects``
        for multipart encoding upload.
    :param auth: (optional) Auth tuple or callable to enable
        Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How long to wait for the server to send
        data before giving up, as a float, or a :ref:`(connect timeout,
        read timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Set to True by default.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol or protocol and
        hostname to the URL of the proxy.
    :param stream: (optional) whether to immediately download the response
        content. Defaults to ``False``.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify
        the server's TLS certificate, or a string, in which case it must be a path
        to a CA bundle to use. Defaults to ``True``.
    :param cert: (optional) if String, path to ssl client cert file (.pem).
        If Tuple, ('cert', 'key') pair.
    :rtype: requests.Response
    """
    # Create the Request.
    req = Request(
        method=method.upper(),
        url=url,
        headers=headers,
        files=files,
        data=data or {},
        json=json,
        params=params or {},
        auth=auth,
        cookies=cookies,
        hooks=hooks,
    )
    prep = self.prepare_request(req)

    proxies = proxies or {}

    settings = self.merge_environment_settings(
        prep.url, proxies, stream, verify, cert
    )

    # Send the request.
    send_kwargs = {
        'timeout': timeout,
        'allow_redirects': allow_redirects,
    }
    send_kwargs.update(settings)
  resp = self.send(prep, **send_kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py:530:


self = <ipfshttpclient.requests_wrapper.Session object at 0x04E00070>, request = <PreparedRequest [POST]>
kwargs = {'cert': None, 'proxies': OrderedDict(), 'stream': True, 'timeout': None, ...}, allow_redirects = True
stream = True, hooks = {'response': []}, adapter = <ipfshttpclient.requests_wrapper.HTTPAdapter object at 0x04E009D0>
start = 177.9868904

def send(self, request, **kwargs):
    """Send a given PreparedRequest.

    :rtype: requests.Response
    """
    # Set defaults that the hooks can utilize to ensure they always have
    # the correct parameters to reproduce the previous request.
    kwargs.setdefault('stream', self.stream)
    kwargs.setdefault('verify', self.verify)
    kwargs.setdefault('cert', self.cert)
    kwargs.setdefault('proxies', self.proxies)

    # It's possible that users might accidentally send a Request object.
    # Guard against that specific failure case.
    if isinstance(request, Request):
        raise ValueError('You can only send PreparedRequests.')

    # Set up variables needed for resolve_redirects and dispatching of hooks
    allow_redirects = kwargs.pop('allow_redirects', True)
    stream = kwargs.get('stream')
    hooks = request.hooks

    # Get the appropriate adapter to use
    adapter = self.get_adapter(url=request.url)

    # Start time (approximately) of the request
    start = preferred_clock()

    # Send the request
  r = adapter.send(request, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py:643:


self = <ipfshttpclient.requests_wrapper.HTTPAdapter object at 0x04E009D0>, request = <PreparedRequest [POST]>
stream = True, timeout = Timeout(connect=None, read=None, total=None), verify = True, cert = None
proxies = OrderedDict()

def send(self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None):
    """Sends PreparedRequest object. Returns Response object.

    :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
    :param stream: (optional) Whether to stream the request content.
    :param timeout: (optional) How long to wait for the server to send
        data before giving up, as a float, or a :ref:`(connect timeout,
        read timeout) <timeouts>` tuple.
    :type timeout: float or tuple or urllib3 Timeout object
    :param verify: (optional) Either a boolean, in which case it controls whether
        we verify the server's TLS certificate, or a string, in which case it
        must be a path to a CA bundle to use
    :param cert: (optional) Any user-provided SSL certificate to be trusted.
    :param proxies: (optional) The proxies dictionary to apply to the request.
    :rtype: requests.Response
    """

    try:
        conn = self.get_connection(request.url, proxies)
    except LocationValueError as e:
        raise InvalidURL(e, request=request)

    self.cert_verify(conn, request.url, verify, cert)
    url = self.request_url(request, proxies)
    self.add_headers(request, stream=stream, timeout=timeout, verify=verify, cert=cert, proxies=proxies)

    chunked = not (request.body is None or 'Content-Length' in request.headers)

    if isinstance(timeout, tuple):
        try:
            connect, read = timeout
            timeout = TimeoutSauce(connect=connect, read=read)
        except ValueError as e:
            # this may raise a string formatting error.
            err = ("Invalid timeout {}. Pass a (connect, read) "
                   "timeout tuple, or a single float to set "
                   "both timeouts to the same value".format(timeout))
            raise ValueError(err)
    elif isinstance(timeout, TimeoutSauce):
        pass
    else:
        timeout = TimeoutSauce(connect=timeout, read=timeout)

    try:
        if not chunked:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout
            )

        # Send the request.
        else:
            if hasattr(conn, 'proxy_pool'):
                conn = conn.proxy_pool

            low_conn = conn._get_conn(timeout=DEFAULT_POOL_TIMEOUT)

            try:
                low_conn.putrequest(request.method,
                                    url,
                                    skip_accept_encoding=True)

                for header, value in request.headers.items():
                    low_conn.putheader(header, value)
              low_conn.endheaders()

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\adapters.py:467:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04E00868>, message_body = None

def endheaders(self, message_body=None, *, encode_chunked=False):
    """Indicate that the last header line has been sent to the server.

    This method sends the request to the server.  The optional message_body
    argument can be used to pass a message body associated with the
    request.
    """
    if self.__state == _CS_REQ_STARTED:
        self.__state = _CS_REQ_SENT
    else:
        raise CannotSendHeader()
  self._send_output(message_body, encode_chunked=encode_chunked)

....\AppData\Local\Programs\Python\Python38-32\lib\http\client.py:1235:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04E00868>, message_body = None
encode_chunked = False

def _send_output(self, message_body=None, encode_chunked=False):
    """Send the currently buffered request and clear the buffer.

    Appends an extra \\r\\n to the buffer.
    A message_body may be specified, to be appended to the request.
    """
    self._buffer.extend((b"", b""))
    msg = b"\r\n".join(self._buffer)
    del self._buffer[:]
  self.send(msg)

....\AppData\Local\Programs\Python\Python38-32\lib\http\client.py:1006:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04E00868>
data = b'POST /api/v0/add?stream-channels=true HTTP/1.1\r\nHost: localhost:5001\r\nContent-Disposition: form-data; name="file...\nContent-Type: multipart/form-data; boundary="ba5d63b578524be9a34d40fc82343a16"\r\nTransfer-Encoding: chunked\r\n\r\n'

def send(self, data):
    """Send `data' to the server.
    ``data`` can be a string object, a bytes object, an array object, a
    file-like object that supports a .read() method, or an iterable object.
    """

    if self.sock is None:
        if self.auto_open:
          self.connect()

....\AppData\Local\Programs\Python\Python38-32\lib\http\client.py:946:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04E00868>

def connect(self):
  conn = self._new_conn()

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\urllib3\connection.py:187:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04E00868>

def _new_conn(self):
    extra_kw = {
            "family": self.family
    }
    if self.source_address:
            extra_kw['source_address'] = self.source_address

    if self.socket_options:
            extra_kw['socket_options'] = self.socket_options

    try:
            dns_host = getattr(self, "_dns_host", self.host)
            conn = create_connection(
                    (dns_host, self.port), self.timeout, **extra_kw)
    except socket.timeout:
            raise urllib3.exceptions.ConnectTimeoutError(
                    self, "Connection to %s timed out. (connect timeout=%s)" %
                    (self.host, self.timeout))
    except OSError as e:
          raise urllib3.exceptions.NewConnectionError(
                    self, "Failed to establish a new connection: %s" % e)

E urllib3.exceptions.NewConnectionError: <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04E00868>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:103: NewConnectionError

During handling of the above exception, another exception occurred:

warc = '1memento.warc', lookup = 'memento/*/?url=memento.us', status = 301, location = '/memento/*/memento.us'

@pytest.mark.parametrize("warc,lookup,status,location", [
    ('salam-home.warc', 'memento/*/cs.odu.edu/~salam/', 302,
     '/memento/20160305192247/cs.odu.edu/~salam/'),
    ('1memento.warc', 'memento/*/memento.us', 302,
     '/memento/20130202100000/memento.us/'),
    ('2mementos.warc', 'memento/*/memento.us', 200, None),
    ('salam-home.warc', 'memento/*/?url=cs.odu.edu/~salam/', 301,
     '/memento/*/cs.odu.edu/~salam/'),
    ('1memento.warc', 'memento/*/?url=memento.us', 301,
     '/memento/*/memento.us'),
    ('2mementos.warc', 'memento/*/?url=memento.us', 301,
     '/memento/*/memento.us'),
    ('2mementos_queryString.warc',
     '/memento/20130202100000/memento.us/' +
     'index.php?anotherval=ipsum&someval=lorem', 200, None),
])
def test_replay_search(warc, lookup, status, location):
  ipwbTest.startReplay(warc)

tests\test_replay.py:53:


tests\testUtil.py:69: in startReplay
cdxjList = indexer.indexFileAt(pathOfWARC, quiet=True)
ipwb\indexer.py:171: in indexFileAt
cdxjLines += getCDXJLinesFromFile(
ipwb\indexer.py:274: in getCDXJLinesFromFile
ipfsHashes = pushToIPFS(hstr, payload)


hstr = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding'
payload = b'Memento for 2/2/2013 10:00am\n'

def pushToIPFS(hstr, payload):
    ipfsRetryCount = 5  # WARC->IPFS attempts before giving up
    retryCount = 0
    while retryCount < ipfsRetryCount:
        try:
            # Py 2/3 str/unicode/byte resolution
            if isinstance(hstr, str):
                hstr = s2b(hstr)
            if isinstance(payload, str):
                payload = s2b(payload)

            if len(payload) == 0:  # py-ipfs-api issue #137
                return

            httpHeaderIPFSHash = pushBytesToIPFS(hstr)
            payloadIPFSHash = pushBytesToIPFS(payload)

            if retryCount > 0:
                m = f'Retrying succeeded after {retryCount} attempts'
                print(m)
            return [httpHeaderIPFSHash, payloadIPFSHash]
        except NewConnectionError as e:
            print('IPFS daemon is likely not running.')
            print('Run "ipfs daemon" in another terminal session.')
          sys.exit()

E SystemExit

ipwb\indexer.py:91: SystemExit
------------------------------------------------ Captured stdout call -------------------------------------------------
Sample data not pulled from IPFS.
Check that the IPFS daemon is running.
IPWB replay started on http://localhost:5000

 * Serving Flask app "ipwb.replay" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
IPFS daemon is likely not running.
Run "ipfs daemon" in another terminal session.
------------------------------------------------ Captured stderr call -------------------------------------------------
Processing WARC records in 1memento.warc: 1/4
_______________ test_replay_search[2mementos.warc-memento/*/?url=memento.us-301-/memento/*/memento.us] ________________

self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04C56160>

def _new_conn(self):
    extra_kw = {
            "family": self.family
    }
    if self.source_address:
            extra_kw['source_address'] = self.source_address

    if self.socket_options:
            extra_kw['socket_options'] = self.socket_options

    try:
            dns_host = getattr(self, "_dns_host", self.host)
          conn = create_connection(
                    (dns_host, self.port), self.timeout, **extra_kw)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:96:


address = ('localhost', 5001), timeout = <object object at 0x00986BF0>, source_address = None
socket_options = [(6, 1, 1)], family = <AddressFamily.AF_UNSPEC: 0>

def create_connection(address, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
                      source_address=None, socket_options=None,
                      family=socket.AF_UNSPEC):
    host, port = address
    if host.startswith('['):
            host = host.strip('[]')
    err = None

    if not family or family == socket.AF_UNSPEC:
            family = urllib3.util.connection.allowed_gai_family()

    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            sock = None
            try:
                    sock = socket.socket(af, socktype, proto)

                    # If provided, set socket level options before connecting.
                    if socket_options is not None:
                            for opt in socket_options:
                                    sock.setsockopt(*opt)

                    if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
                            sock.settimeout(timeout)
                    if source_address:
                            sock.bind(source_address)
                    sock.connect(sa)
                    return sock
            except OSError as e:
                    err = e
                    if sock is not None:
                            sock.close()
                            sock = None

    if err is not None:
          raise err

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:66:


address = ('localhost', 5001), timeout = <object object at 0x00986BF0>, source_address = None
socket_options = [(6, 1, 1)], family = <AddressFamily.AF_UNSPEC: 0>

def create_connection(address, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
                      source_address=None, socket_options=None,
                      family=socket.AF_UNSPEC):
    host, port = address
    if host.startswith('['):
            host = host.strip('[]')
    err = None

    if not family or family == socket.AF_UNSPEC:
            family = urllib3.util.connection.allowed_gai_family()

    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            sock = None
            try:
                    sock = socket.socket(af, socktype, proto)

                    # If provided, set socket level options before connecting.
                    if socket_options is not None:
                            for opt in socket_options:
                                    sock.setsockopt(*opt)

                    if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
                            sock.settimeout(timeout)
                    if source_address:
                            sock.bind(source_address)
                  sock.connect(sa)

E ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:57: ConnectionRefusedError

During handling of the above exception, another exception occurred:

hstr = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding'
payload = b'Memento for 2/2/2013 10:00am\n'

def pushToIPFS(hstr, payload):
    ipfsRetryCount = 5  # WARC->IPFS attempts before giving up
    retryCount = 0
    while retryCount < ipfsRetryCount:
        try:
            # Py 2/3 str/unicode/byte resolution
            if isinstance(hstr, str):
                hstr = s2b(hstr)
            if isinstance(payload, str):
                payload = s2b(payload)

            if len(payload) == 0:  # py-ipfs-api issue #137
                return
          httpHeaderIPFSHash = pushBytesToIPFS(hstr)

ipwb\indexer.py:80:


bytes = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding'

def pushBytesToIPFS(bytes):
    """
    Call the IPFS API to add the byte string to IPFS.
    When IPFS returns a hash, return this to the caller
    """
    global IPFS_API

    # Returns unicode in py2.7, str in py3.7
    try:
      res = IPFS_API.add_bytes(bytes)  # bytes)

ipwb\indexer.py:383:


args = (<ipfshttpclient.client.Client object at 0x03F4ACB8>, b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding')
kwargs = {}

@wraps(cmd)
def wrapper(*args, **kwargs):
    """Returns the specified field of the command invocation.

    Parameters
    ----------
    args : list
            Positional parameters to pass to the wrapped callable
    kwargs : dict
            Named parameter to pass to the wrapped callable
    """
  res = cmd(*args, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\utils.py:148:


args = (<ipfshttpclient.client.Client object at 0x03F4ACB8>, b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding')
kwargs = {}

@functools.wraps(func)
def wrapper2(*args: ty.Any, **kwargs: ty.Any) -> R:
  result = func(*args, **kwargs)  # type: T

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\client\base.py:136:


self = <ipfshttpclient.client.Client object at 0x03F4ACB8>
data = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding'
kwargs = {}, body = <generator object BytesFileStream.body at 0x04BEA108>
headers = {'Content-Disposition': 'form-data; name="file"; filename="bytes"', 'Content-Type': 'multipart/form-data; boundary="1a03967d40f84117be0d1f1615b1b30a"'}

@utils.return_field('Hash')
@base.returns_single_item(dict)
def add_bytes(self, data, **kwargs):
    """Adds a set of bytes as a file to IPFS.

    .. code-block:: python

            >>> client.add_bytes(b"Mary had a little lamb")
            'QmZfF6C9j4VtoCsTp4KSrhYH47QMd3DNXVZBKaxJdhaPab'

    Also accepts and will stream generator objects.

    Parameters
    ----------
    data : bytes
            Content to be added as a file

    Returns
    -------
            str
                    Hash of the added IPFS object
    """
    body, headers = multipart.stream_bytes(data, chunk_size=self.chunk_size)
  return self._client.request('/add', decoder='json',
                                data=body, headers=headers, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\client\__init__.py:257:


self = <ipfshttpclient.http_requests.ClientSync object at 0x048B4528>, path = '/add', args = []

def request(
            self, path: str,
            args: ty.Sequence[str] = [], *,
            opts: ty.Mapping[str, str] = {},
            decoder: str = "none",
            stream: bool = False,
            offline: bool = False,
            return_result: bool = True,
            auth: auth_t = None,
            cookies: cookies_t = None,
            data: reqdata_sync_t = None,
            headers: headers_t = None,
            timeout: timeout_t = None
) -> ty.Optional[ty.Union[  # noqa: ET122 (checker bug)
    StreamDecodeIteratorSync[bytes],
    StreamDecodeIteratorSync[object],
    bytes,
    ty.List[object],
]]:
    """Sends an HTTP request to the IPFS daemon

    This function returns the contents of the HTTP response from the IPFS
    daemon.

    Raises
    ------
    ~ipfshttpclient.exceptions.ErrorResponse
    ~ipfshttpclient.exceptions.ConnectionError
    ~ipfshttpclient.exceptions.ProtocolError
    ~ipfshttpclient.exceptions.StatusError
    ~ipfshttpclient.exceptions.TimeoutError

    Parameters
    ----------
    path
            The command path relative to the given base
    decoder
            The encoder to use to parse the HTTP response
    stream
            Whether to return an iterable yielding the received items incrementally
            instead of receiving and decoding all items up-front before returning
            them
    args
            Positional parameters to be sent along with the HTTP request
    opts
            Query string paramters to be sent along with the HTTP request
    offline
            Whether to request to daemon to handle this request in “offline-mode”
    return_result
            Whether to decode the values received from the daemon
    auth
            Authentication data to send along with this request as
            ``(username, password)`` tuple
    cookies
            HTTP cookies to send along with each request to the API daemon
    data
            Iterable yielding data to stream from the client to the daemon
    headers
            Custom HTTP headers to pass send along with the request
    timeout
            How many seconds to wait for the server to send data
            before giving up

            Set this to :py:`math.inf` to disable timeouts entirely.
    """
    # Don't attempt to decode response or stream
    # (which would keep an iterator open that will then never be waited for)
    if not return_result:
            decoder = None

    # HTTP method must always be "POST" since go-IPFS 0.5
    method = "POST"
    if "use_http_head_for_no_result" in self.workarounds and not return_result:  # pragma: no cover
            method = "HEAD"

    parser = encoding.get_encoding(decoder if decoder else "none")
  closables, res = self._request(
            method, path, map_args_to_params(args, opts, offline=offline),
            auth=auth, data=data, headers=headers, timeout=timeout,
            chunk_size=None,
    )

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\http_common.py:564:


self = <ipfshttpclient.http_requests.ClientSync object at 0x048B4528>, method = 'POST', path = 'add', params = []

def _request(
            self, method: str, path: str, params: ty.Sequence[ty.Tuple[str, str]], *,
            auth: auth_t,
            data: reqdata_sync_t,
            headers: headers_t,
            timeout: timeout_t,
            chunk_size: ty.Optional[int]
) -> ty.Tuple[ty.List[Closable], ty.Iterator[bytes]]:
    # Ensure path is relative so that it is resolved relative to the base
    while path.startswith("/"):
            path = path[1:]

    url = urllib.parse.urljoin(self._base_url, path)

    try:
            # Determine session object to use
            closables, session = self._access_session()

            # Do HTTP request (synchronously) and map exceptions
            try:
                  res = session.request(
                            method=method,
                            url=url,
                            **map_args_to_requests(
                                    params=params,
                                    auth=auth,
                                    headers=headers,
                                    timeout=timeout,
                            ),
                            data=data,
                            stream=True,
                    )

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\http_requests.py:152:


self = <ipfshttpclient.requests_wrapper.Session object at 0x04A6E3E8>, method = 'POST'
url = 'http://localhost:5001/api/v0/add', args = ()
kwargs = {'data': <generator object BytesFileStream.body at 0x04BEA108>, 'headers': {'Content-Disposition': 'form-data; name="f...s"', 'Content-Type': 'multipart/form-data; boundary="1a03967d40f84117be0d1f1615b1b30a"'}, 'params': {}, 'stream': True}
family = <AddressFamily.AF_UNSPEC: 0>

def request(self, method, url, *args, **kwargs):
    family = kwargs.pop("family", self.family)
    if family != socket.AF_UNSPEC:
            # Inject provided address family value as extension to scheme
            url = urllib.parse.urlparse(url)
            url = url._replace(scheme="{0}+{1}".format(url.scheme, AF2NAME[int(family)]))
            url = url.geturl()
  return super().request(method, url, *args, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:219:


self = <ipfshttpclient.requests_wrapper.Session object at 0x04A6E3E8>, method = 'POST'
url = 'http://localhost:5001/api/v0/add', params = {}, data = <generator object BytesFileStream.body at 0x04BEA108>
headers = {'Content-Disposition': 'form-data; name="file"; filename="bytes"', 'Content-Type': 'multipart/form-data; boundary="1a03967d40f84117be0d1f1615b1b30a"'}
cookies = None, files = None, auth = None, timeout = None, allow_redirects = True, proxies = {}, hooks = None
stream = True, verify = None, cert = None, json = None

def request(self, method, url,
        params=None, data=None, headers=None, cookies=None, files=None,
        auth=None, timeout=None, allow_redirects=True, proxies=None,
        hooks=None, stream=None, verify=None, cert=None, json=None):
    """Constructs a :class:`Request <Request>`, prepares it and sends it.
    Returns :class:`Response <Response>` object.

    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary or bytes to be sent in the query
        string for the :class:`Request`.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
        object to send in the body of the :class:`Request`.
    :param json: (optional) json to send in the body of the
        :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the
        :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the
        :class:`Request`.
    :param files: (optional) Dictionary of ``'filename': file-like-objects``
        for multipart encoding upload.
    :param auth: (optional) Auth tuple or callable to enable
        Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How long to wait for the server to send
        data before giving up, as a float, or a :ref:`(connect timeout,
        read timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Set to True by default.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol or protocol and
        hostname to the URL of the proxy.
    :param stream: (optional) whether to immediately download the response
        content. Defaults to ``False``.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify
        the server's TLS certificate, or a string, in which case it must be a path
        to a CA bundle to use. Defaults to ``True``.
    :param cert: (optional) if String, path to ssl client cert file (.pem).
        If Tuple, ('cert', 'key') pair.
    :rtype: requests.Response
    """
    # Create the Request.
    req = Request(
        method=method.upper(),
        url=url,
        headers=headers,
        files=files,
        data=data or {},
        json=json,
        params=params or {},
        auth=auth,
        cookies=cookies,
        hooks=hooks,
    )
    prep = self.prepare_request(req)

    proxies = proxies or {}

    settings = self.merge_environment_settings(
        prep.url, proxies, stream, verify, cert
    )

    # Send the request.
    send_kwargs = {
        'timeout': timeout,
        'allow_redirects': allow_redirects,
    }
    send_kwargs.update(settings)
  resp = self.send(prep, **send_kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py:530:


self = <ipfshttpclient.requests_wrapper.Session object at 0x04A6E3E8>, request = <PreparedRequest [POST]>
kwargs = {'cert': None, 'proxies': OrderedDict(), 'stream': True, 'timeout': None, ...}, allow_redirects = True
stream = True, hooks = {'response': []}, adapter = <ipfshttpclient.requests_wrapper.HTTPAdapter object at 0x04A6E400>
start = 187.4754634

def send(self, request, **kwargs):
    """Send a given PreparedRequest.

    :rtype: requests.Response
    """
    # Set defaults that the hooks can utilize to ensure they always have
    # the correct parameters to reproduce the previous request.
    kwargs.setdefault('stream', self.stream)
    kwargs.setdefault('verify', self.verify)
    kwargs.setdefault('cert', self.cert)
    kwargs.setdefault('proxies', self.proxies)

    # It's possible that users might accidentally send a Request object.
    # Guard against that specific failure case.
    if isinstance(request, Request):
        raise ValueError('You can only send PreparedRequests.')

    # Set up variables needed for resolve_redirects and dispatching of hooks
    allow_redirects = kwargs.pop('allow_redirects', True)
    stream = kwargs.get('stream')
    hooks = request.hooks

    # Get the appropriate adapter to use
    adapter = self.get_adapter(url=request.url)

    # Start time (approximately) of the request
    start = preferred_clock()

    # Send the request
  r = adapter.send(request, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py:643:


self = <ipfshttpclient.requests_wrapper.HTTPAdapter object at 0x04A6E400>, request = <PreparedRequest [POST]>
stream = True, timeout = Timeout(connect=None, read=None, total=None), verify = True, cert = None
proxies = OrderedDict()

def send(self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None):
    """Sends PreparedRequest object. Returns Response object.

    :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
    :param stream: (optional) Whether to stream the request content.
    :param timeout: (optional) How long to wait for the server to send
        data before giving up, as a float, or a :ref:`(connect timeout,
        read timeout) <timeouts>` tuple.
    :type timeout: float or tuple or urllib3 Timeout object
    :param verify: (optional) Either a boolean, in which case it controls whether
        we verify the server's TLS certificate, or a string, in which case it
        must be a path to a CA bundle to use
    :param cert: (optional) Any user-provided SSL certificate to be trusted.
    :param proxies: (optional) The proxies dictionary to apply to the request.
    :rtype: requests.Response
    """

    try:
        conn = self.get_connection(request.url, proxies)
    except LocationValueError as e:
        raise InvalidURL(e, request=request)

    self.cert_verify(conn, request.url, verify, cert)
    url = self.request_url(request, proxies)
    self.add_headers(request, stream=stream, timeout=timeout, verify=verify, cert=cert, proxies=proxies)

    chunked = not (request.body is None or 'Content-Length' in request.headers)

    if isinstance(timeout, tuple):
        try:
            connect, read = timeout
            timeout = TimeoutSauce(connect=connect, read=read)
        except ValueError as e:
            # this may raise a string formatting error.
            err = ("Invalid timeout {}. Pass a (connect, read) "
                   "timeout tuple, or a single float to set "
                   "both timeouts to the same value".format(timeout))
            raise ValueError(err)
    elif isinstance(timeout, TimeoutSauce):
        pass
    else:
        timeout = TimeoutSauce(connect=timeout, read=timeout)

    try:
        if not chunked:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout
            )

        # Send the request.
        else:
            if hasattr(conn, 'proxy_pool'):
                conn = conn.proxy_pool

            low_conn = conn._get_conn(timeout=DEFAULT_POOL_TIMEOUT)

            try:
                low_conn.putrequest(request.method,
                                    url,
                                    skip_accept_encoding=True)

                for header, value in request.headers.items():
                    low_conn.putheader(header, value)
              low_conn.endheaders()

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\adapters.py:467:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04C56160>, message_body = None

def endheaders(self, message_body=None, *, encode_chunked=False):
    """Indicate that the last header line has been sent to the server.

    This method sends the request to the server.  The optional message_body
    argument can be used to pass a message body associated with the
    request.
    """
    if self.__state == _CS_REQ_STARTED:
        self.__state = _CS_REQ_SENT
    else:
        raise CannotSendHeader()
  self._send_output(message_body, encode_chunked=encode_chunked)

....\AppData\Local\Programs\Python\Python38-32\lib\http\client.py:1235:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04C56160>, message_body = None
encode_chunked = False

def _send_output(self, message_body=None, encode_chunked=False):
    """Send the currently buffered request and clear the buffer.

    Appends an extra \\r\\n to the buffer.
    A message_body may be specified, to be appended to the request.
    """
    self._buffer.extend((b"", b""))
    msg = b"\r\n".join(self._buffer)
    del self._buffer[:]
  self.send(msg)

....\AppData\Local\Programs\Python\Python38-32\lib\http\client.py:1006:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04C56160>
data = b'POST /api/v0/add?stream-channels=true HTTP/1.1\r\nHost: localhost:5001\r\nContent-Disposition: form-data; name="file...\nContent-Type: multipart/form-data; boundary="1a03967d40f84117be0d1f1615b1b30a"\r\nTransfer-Encoding: chunked\r\n\r\n'

def send(self, data):
    """Send `data' to the server.
    ``data`` can be a string object, a bytes object, an array object, a
    file-like object that supports a .read() method, or an iterable object.
    """

    if self.sock is None:
        if self.auto_open:
          self.connect()

....\AppData\Local\Programs\Python\Python38-32\lib\http\client.py:946:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04C56160>

def connect(self):
  conn = self._new_conn()

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\urllib3\connection.py:187:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04C56160>

def _new_conn(self):
    extra_kw = {
            "family": self.family
    }
    if self.source_address:
            extra_kw['source_address'] = self.source_address

    if self.socket_options:
            extra_kw['socket_options'] = self.socket_options

    try:
            dns_host = getattr(self, "_dns_host", self.host)
            conn = create_connection(
                    (dns_host, self.port), self.timeout, **extra_kw)
    except socket.timeout:
            raise urllib3.exceptions.ConnectTimeoutError(
                    self, "Connection to %s timed out. (connect timeout=%s)" %
                    (self.host, self.timeout))
    except OSError as e:
          raise urllib3.exceptions.NewConnectionError(
                    self, "Failed to establish a new connection: %s" % e)

E urllib3.exceptions.NewConnectionError: <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04C56160>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:103: NewConnectionError

During handling of the above exception, another exception occurred:

warc = '2mementos.warc', lookup = 'memento/*/?url=memento.us', status = 301, location = '/memento/*/memento.us'

@pytest.mark.parametrize("warc,lookup,status,location", [
    ('salam-home.warc', 'memento/*/cs.odu.edu/~salam/', 302,
     '/memento/20160305192247/cs.odu.edu/~salam/'),
    ('1memento.warc', 'memento/*/memento.us', 302,
     '/memento/20130202100000/memento.us/'),
    ('2mementos.warc', 'memento/*/memento.us', 200, None),
    ('salam-home.warc', 'memento/*/?url=cs.odu.edu/~salam/', 301,
     '/memento/*/cs.odu.edu/~salam/'),
    ('1memento.warc', 'memento/*/?url=memento.us', 301,
     '/memento/*/memento.us'),
    ('2mementos.warc', 'memento/*/?url=memento.us', 301,
     '/memento/*/memento.us'),
    ('2mementos_queryString.warc',
     '/memento/20130202100000/memento.us/' +
     'index.php?anotherval=ipsum&someval=lorem', 200, None),
])
def test_replay_search(warc, lookup, status, location):
  ipwbTest.startReplay(warc)

tests\test_replay.py:53:


tests\testUtil.py:69: in startReplay
cdxjList = indexer.indexFileAt(pathOfWARC, quiet=True)
ipwb\indexer.py:171: in indexFileAt
cdxjLines += getCDXJLinesFromFile(
ipwb\indexer.py:274: in getCDXJLinesFromFile
ipfsHashes = pushToIPFS(hstr, payload)


hstr = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding'
payload = b'Memento for 2/2/2013 10:00am\n'

def pushToIPFS(hstr, payload):
    ipfsRetryCount = 5  # WARC->IPFS attempts before giving up
    retryCount = 0
    while retryCount < ipfsRetryCount:
        try:
            # Py 2/3 str/unicode/byte resolution
            if isinstance(hstr, str):
                hstr = s2b(hstr)
            if isinstance(payload, str):
                payload = s2b(payload)

            if len(payload) == 0:  # py-ipfs-api issue #137
                return

            httpHeaderIPFSHash = pushBytesToIPFS(hstr)
            payloadIPFSHash = pushBytesToIPFS(payload)

            if retryCount > 0:
                m = f'Retrying succeeded after {retryCount} attempts'
                print(m)
            return [httpHeaderIPFSHash, payloadIPFSHash]
        except NewConnectionError as e:
            print('IPFS daemon is likely not running.')
            print('Run "ipfs daemon" in another terminal session.')
          sys.exit()

E SystemExit

ipwb\indexer.py:91: SystemExit
------------------------------------------------ Captured stdout call -------------------------------------------------
Sample data not pulled from IPFS.
Check that the IPFS daemon is running.
IPWB replay started on http://localhost:5000

 * Serving Flask app "ipwb.replay" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
IPFS daemon is likely not running.
Run "ipfs daemon" in another terminal session.
------------------------------------------------ Captured stderr call -------------------------------------------------
Processing WARC records in 2mementos.warc: 1/5
_ test_replay_search[2mementos_queryString.warc-/memento/20130202100000/memento.us/index.php?anotherval=ipsum&someval=lorem-200-None] _

self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04D1E9D0>

def _new_conn(self):
    extra_kw = {
            "family": self.family
    }
    if self.source_address:
            extra_kw['source_address'] = self.source_address

    if self.socket_options:
            extra_kw['socket_options'] = self.socket_options

    try:
            dns_host = getattr(self, "_dns_host", self.host)
          conn = create_connection(
                    (dns_host, self.port), self.timeout, **extra_kw)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:96:


address = ('localhost', 5001), timeout = <object object at 0x00986BF0>, source_address = None
socket_options = [(6, 1, 1)], family = <AddressFamily.AF_UNSPEC: 0>

def create_connection(address, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
                      source_address=None, socket_options=None,
                      family=socket.AF_UNSPEC):
    host, port = address
    if host.startswith('['):
            host = host.strip('[]')
    err = None

    if not family or family == socket.AF_UNSPEC:
            family = urllib3.util.connection.allowed_gai_family()

    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            sock = None
            try:
                    sock = socket.socket(af, socktype, proto)

                    # If provided, set socket level options before connecting.
                    if socket_options is not None:
                            for opt in socket_options:
                                    sock.setsockopt(*opt)

                    if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
                            sock.settimeout(timeout)
                    if source_address:
                            sock.bind(source_address)
                    sock.connect(sa)
                    return sock
            except OSError as e:
                    err = e
                    if sock is not None:
                            sock.close()
                            sock = None

    if err is not None:
          raise err

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:66:


address = ('localhost', 5001), timeout = <object object at 0x00986BF0>, source_address = None
socket_options = [(6, 1, 1)], family = <AddressFamily.AF_UNSPEC: 0>

def create_connection(address, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
                      source_address=None, socket_options=None,
                      family=socket.AF_UNSPEC):
    host, port = address
    if host.startswith('['):
            host = host.strip('[]')
    err = None

    if not family or family == socket.AF_UNSPEC:
            family = urllib3.util.connection.allowed_gai_family()

    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            sock = None
            try:
                    sock = socket.socket(af, socktype, proto)

                    # If provided, set socket level options before connecting.
                    if socket_options is not None:
                            for opt in socket_options:
                                    sock.setsockopt(*opt)

                    if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
                            sock.settimeout(timeout)
                    if source_address:
                            sock.bind(source_address)
                  sock.connect(sa)

E ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:57: ConnectionRefusedError

During handling of the above exception, another exception occurred:

hstr = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding'
payload = b'Memento for 2/2/2013 10:00am\n'

def pushToIPFS(hstr, payload):
    ipfsRetryCount = 5  # WARC->IPFS attempts before giving up
    retryCount = 0
    while retryCount < ipfsRetryCount:
        try:
            # Py 2/3 str/unicode/byte resolution
            if isinstance(hstr, str):
                hstr = s2b(hstr)
            if isinstance(payload, str):
                payload = s2b(payload)

            if len(payload) == 0:  # py-ipfs-api issue #137
                return
          httpHeaderIPFSHash = pushBytesToIPFS(hstr)

ipwb\indexer.py:80:


bytes = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding'

def pushBytesToIPFS(bytes):
    """
    Call the IPFS API to add the byte string to IPFS.
    When IPFS returns a hash, return this to the caller
    """
    global IPFS_API

    # Returns unicode in py2.7, str in py3.7
    try:
      res = IPFS_API.add_bytes(bytes)  # bytes)

ipwb\indexer.py:383:


args = (<ipfshttpclient.client.Client object at 0x03F4ACB8>, b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding')
kwargs = {}

@wraps(cmd)
def wrapper(*args, **kwargs):
    """Returns the specified field of the command invocation.

    Parameters
    ----------
    args : list
            Positional parameters to pass to the wrapped callable
    kwargs : dict
            Named parameter to pass to the wrapped callable
    """
  res = cmd(*args, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\utils.py:148:


args = (<ipfshttpclient.client.Client object at 0x03F4ACB8>, b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding')
kwargs = {}

@functools.wraps(func)
def wrapper2(*args: ty.Any, **kwargs: ty.Any) -> R:
  result = func(*args, **kwargs)  # type: T

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\client\base.py:136:


self = <ipfshttpclient.client.Client object at 0x03F4ACB8>
data = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding'
kwargs = {}, body = <generator object BytesFileStream.body at 0x04D63F78>
headers = {'Content-Disposition': 'form-data; name="file"; filename="bytes"', 'Content-Type': 'multipart/form-data; boundary="9c7b459639c44ed0a77d4acf38b5b647"'}

@utils.return_field('Hash')
@base.returns_single_item(dict)
def add_bytes(self, data, **kwargs):
    """Adds a set of bytes as a file to IPFS.

    .. code-block:: python

            >>> client.add_bytes(b"Mary had a little lamb")
            'QmZfF6C9j4VtoCsTp4KSrhYH47QMd3DNXVZBKaxJdhaPab'

    Also accepts and will stream generator objects.

    Parameters
    ----------
    data : bytes
            Content to be added as a file

    Returns
    -------
            str
                    Hash of the added IPFS object
    """
    body, headers = multipart.stream_bytes(data, chunk_size=self.chunk_size)
  return self._client.request('/add', decoder='json',
                                data=body, headers=headers, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\client\__init__.py:257:


self = <ipfshttpclient.http_requests.ClientSync object at 0x048B4528>, path = '/add', args = []

def request(
            self, path: str,
            args: ty.Sequence[str] = [], *,
            opts: ty.Mapping[str, str] = {},
            decoder: str = "none",
            stream: bool = False,
            offline: bool = False,
            return_result: bool = True,
            auth: auth_t = None,
            cookies: cookies_t = None,
            data: reqdata_sync_t = None,
            headers: headers_t = None,
            timeout: timeout_t = None
) -> ty.Optional[ty.Union[  # noqa: ET122 (checker bug)
    StreamDecodeIteratorSync[bytes],
    StreamDecodeIteratorSync[object],
    bytes,
    ty.List[object],
]]:
    """Sends an HTTP request to the IPFS daemon

    This function returns the contents of the HTTP response from the IPFS
    daemon.

    Raises
    ------
    ~ipfshttpclient.exceptions.ErrorResponse
    ~ipfshttpclient.exceptions.ConnectionError
    ~ipfshttpclient.exceptions.ProtocolError
    ~ipfshttpclient.exceptions.StatusError
    ~ipfshttpclient.exceptions.TimeoutError

    Parameters
    ----------
    path
            The command path relative to the given base
    decoder
            The encoder to use to parse the HTTP response
    stream
            Whether to return an iterable yielding the received items incrementally
            instead of receiving and decoding all items up-front before returning
            them
    args
            Positional parameters to be sent along with the HTTP request
    opts
            Query string paramters to be sent along with the HTTP request
    offline
            Whether to request to daemon to handle this request in “offline-mode”
    return_result
            Whether to decode the values received from the daemon
    auth
            Authentication data to send along with this request as
            ``(username, password)`` tuple
    cookies
            HTTP cookies to send along with each request to the API daemon
    data
            Iterable yielding data to stream from the client to the daemon
    headers
            Custom HTTP headers to pass send along with the request
    timeout
            How many seconds to wait for the server to send data
            before giving up

            Set this to :py:`math.inf` to disable timeouts entirely.
    """
    # Don't attempt to decode response or stream
    # (which would keep an iterator open that will then never be waited for)
    if not return_result:
            decoder = None

    # HTTP method must always be "POST" since go-IPFS 0.5
    method = "POST"
    if "use_http_head_for_no_result" in self.workarounds and not return_result:  # pragma: no cover
            method = "HEAD"

    parser = encoding.get_encoding(decoder if decoder else "none")
  closables, res = self._request(
            method, path, map_args_to_params(args, opts, offline=offline),
            auth=auth, data=data, headers=headers, timeout=timeout,
            chunk_size=None,
    )

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\http_common.py:564:


self = <ipfshttpclient.http_requests.ClientSync object at 0x048B4528>, method = 'POST', path = 'add', params = []

def _request(
            self, method: str, path: str, params: ty.Sequence[ty.Tuple[str, str]], *,
            auth: auth_t,
            data: reqdata_sync_t,
            headers: headers_t,
            timeout: timeout_t,
            chunk_size: ty.Optional[int]
) -> ty.Tuple[ty.List[Closable], ty.Iterator[bytes]]:
    # Ensure path is relative so that it is resolved relative to the base
    while path.startswith("/"):
            path = path[1:]

    url = urllib.parse.urljoin(self._base_url, path)

    try:
            # Determine session object to use
            closables, session = self._access_session()

            # Do HTTP request (synchronously) and map exceptions
            try:
                  res = session.request(
                            method=method,
                            url=url,
                            **map_args_to_requests(
                                    params=params,
                                    auth=auth,
                                    headers=headers,
                                    timeout=timeout,
                            ),
                            data=data,
                            stream=True,
                    )

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\http_requests.py:152:


self = <ipfshttpclient.requests_wrapper.Session object at 0x04D1E6E8>, method = 'POST'
url = 'http://localhost:5001/api/v0/add', args = ()
kwargs = {'data': <generator object BytesFileStream.body at 0x04D63F78>, 'headers': {'Content-Disposition': 'form-data; name="f...s"', 'Content-Type': 'multipart/form-data; boundary="9c7b459639c44ed0a77d4acf38b5b647"'}, 'params': {}, 'stream': True}
family = <AddressFamily.AF_UNSPEC: 0>

def request(self, method, url, *args, **kwargs):
    family = kwargs.pop("family", self.family)
    if family != socket.AF_UNSPEC:
            # Inject provided address family value as extension to scheme
            url = urllib.parse.urlparse(url)
            url = url._replace(scheme="{0}+{1}".format(url.scheme, AF2NAME[int(family)]))
            url = url.geturl()
  return super().request(method, url, *args, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:219:


self = <ipfshttpclient.requests_wrapper.Session object at 0x04D1E6E8>, method = 'POST'
url = 'http://localhost:5001/api/v0/add', params = {}, data = <generator object BytesFileStream.body at 0x04D63F78>
headers = {'Content-Disposition': 'form-data; name="file"; filename="bytes"', 'Content-Type': 'multipart/form-data; boundary="9c7b459639c44ed0a77d4acf38b5b647"'}
cookies = None, files = None, auth = None, timeout = None, allow_redirects = True, proxies = {}, hooks = None
stream = True, verify = None, cert = None, json = None

def request(self, method, url,
        params=None, data=None, headers=None, cookies=None, files=None,
        auth=None, timeout=None, allow_redirects=True, proxies=None,
        hooks=None, stream=None, verify=None, cert=None, json=None):
    """Constructs a :class:`Request <Request>`, prepares it and sends it.
    Returns :class:`Response <Response>` object.

    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary or bytes to be sent in the query
        string for the :class:`Request`.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
        object to send in the body of the :class:`Request`.
    :param json: (optional) json to send in the body of the
        :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the
        :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the
        :class:`Request`.
    :param files: (optional) Dictionary of ``'filename': file-like-objects``
        for multipart encoding upload.
    :param auth: (optional) Auth tuple or callable to enable
        Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How long to wait for the server to send
        data before giving up, as a float, or a :ref:`(connect timeout,
        read timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Set to True by default.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol or protocol and
        hostname to the URL of the proxy.
    :param stream: (optional) whether to immediately download the response
        content. Defaults to ``False``.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify
        the server's TLS certificate, or a string, in which case it must be a path
        to a CA bundle to use. Defaults to ``True``.
    :param cert: (optional) if String, path to ssl client cert file (.pem).
        If Tuple, ('cert', 'key') pair.
    :rtype: requests.Response
    """
    # Create the Request.
    req = Request(
        method=method.upper(),
        url=url,
        headers=headers,
        files=files,
        data=data or {},
        json=json,
        params=params or {},
        auth=auth,
        cookies=cookies,
        hooks=hooks,
    )
    prep = self.prepare_request(req)

    proxies = proxies or {}

    settings = self.merge_environment_settings(
        prep.url, proxies, stream, verify, cert
    )

    # Send the request.
    send_kwargs = {
        'timeout': timeout,
        'allow_redirects': allow_redirects,
    }
    send_kwargs.update(settings)
  resp = self.send(prep, **send_kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py:530:


self = <ipfshttpclient.requests_wrapper.Session object at 0x04D1E6E8>, request = <PreparedRequest [POST]>
kwargs = {'cert': None, 'proxies': OrderedDict(), 'stream': True, 'timeout': None, ...}, allow_redirects = True
stream = True, hooks = {'response': []}, adapter = <ipfshttpclient.requests_wrapper.HTTPAdapter object at 0x04D1E3A0>
start = 196.9223957

def send(self, request, **kwargs):
    """Send a given PreparedRequest.

    :rtype: requests.Response
    """
    # Set defaults that the hooks can utilize to ensure they always have
    # the correct parameters to reproduce the previous request.
    kwargs.setdefault('stream', self.stream)
    kwargs.setdefault('verify', self.verify)
    kwargs.setdefault('cert', self.cert)
    kwargs.setdefault('proxies', self.proxies)

    # It's possible that users might accidentally send a Request object.
    # Guard against that specific failure case.
    if isinstance(request, Request):
        raise ValueError('You can only send PreparedRequests.')

    # Set up variables needed for resolve_redirects and dispatching of hooks
    allow_redirects = kwargs.pop('allow_redirects', True)
    stream = kwargs.get('stream')
    hooks = request.hooks

    # Get the appropriate adapter to use
    adapter = self.get_adapter(url=request.url)

    # Start time (approximately) of the request
    start = preferred_clock()

    # Send the request
  r = adapter.send(request, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py:643:


self = <ipfshttpclient.requests_wrapper.HTTPAdapter object at 0x04D1E3A0>, request = <PreparedRequest [POST]>
stream = True, timeout = Timeout(connect=None, read=None, total=None), verify = True, cert = None
proxies = OrderedDict()

def send(self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None):
    """Sends PreparedRequest object. Returns Response object.

    :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
    :param stream: (optional) Whether to stream the request content.
    :param timeout: (optional) How long to wait for the server to send
        data before giving up, as a float, or a :ref:`(connect timeout,
        read timeout) <timeouts>` tuple.
    :type timeout: float or tuple or urllib3 Timeout object
    :param verify: (optional) Either a boolean, in which case it controls whether
        we verify the server's TLS certificate, or a string, in which case it
        must be a path to a CA bundle to use
    :param cert: (optional) Any user-provided SSL certificate to be trusted.
    :param proxies: (optional) The proxies dictionary to apply to the request.
    :rtype: requests.Response
    """

    try:
        conn = self.get_connection(request.url, proxies)
    except LocationValueError as e:
        raise InvalidURL(e, request=request)

    self.cert_verify(conn, request.url, verify, cert)
    url = self.request_url(request, proxies)
    self.add_headers(request, stream=stream, timeout=timeout, verify=verify, cert=cert, proxies=proxies)

    chunked = not (request.body is None or 'Content-Length' in request.headers)

    if isinstance(timeout, tuple):
        try:
            connect, read = timeout
            timeout = TimeoutSauce(connect=connect, read=read)
        except ValueError as e:
            # this may raise a string formatting error.
            err = ("Invalid timeout {}. Pass a (connect, read) "
                   "timeout tuple, or a single float to set "
                   "both timeouts to the same value".format(timeout))
            raise ValueError(err)
    elif isinstance(timeout, TimeoutSauce):
        pass
    else:
        timeout = TimeoutSauce(connect=timeout, read=timeout)

    try:
        if not chunked:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout
            )

        # Send the request.
        else:
            if hasattr(conn, 'proxy_pool'):
                conn = conn.proxy_pool

            low_conn = conn._get_conn(timeout=DEFAULT_POOL_TIMEOUT)

            try:
                low_conn.putrequest(request.method,
                                    url,
                                    skip_accept_encoding=True)

                for header, value in request.headers.items():
                    low_conn.putheader(header, value)
              low_conn.endheaders()

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\adapters.py:467:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04D1E9D0>, message_body = None

def endheaders(self, message_body=None, *, encode_chunked=False):
    """Indicate that the last header line has been sent to the server.

    This method sends the request to the server.  The optional message_body
    argument can be used to pass a message body associated with the
    request.
    """
    if self.__state == _CS_REQ_STARTED:
        self.__state = _CS_REQ_SENT
    else:
        raise CannotSendHeader()
  self._send_output(message_body, encode_chunked=encode_chunked)

....\AppData\Local\Programs\Python\Python38-32\lib\http\client.py:1235:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04D1E9D0>, message_body = None
encode_chunked = False

def _send_output(self, message_body=None, encode_chunked=False):
    """Send the currently buffered request and clear the buffer.

    Appends an extra \\r\\n to the buffer.
    A message_body may be specified, to be appended to the request.
    """
    self._buffer.extend((b"", b""))
    msg = b"\r\n".join(self._buffer)
    del self._buffer[:]
  self.send(msg)

....\AppData\Local\Programs\Python\Python38-32\lib\http\client.py:1006:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04D1E9D0>
data = b'POST /api/v0/add?stream-channels=true HTTP/1.1\r\nHost: localhost:5001\r\nContent-Disposition: form-data; name="file...\nContent-Type: multipart/form-data; boundary="9c7b459639c44ed0a77d4acf38b5b647"\r\nTransfer-Encoding: chunked\r\n\r\n'

def send(self, data):
    """Send `data' to the server.
    ``data`` can be a string object, a bytes object, an array object, a
    file-like object that supports a .read() method, or an iterable object.
    """

    if self.sock is None:
        if self.auto_open:
          self.connect()

....\AppData\Local\Programs\Python\Python38-32\lib\http\client.py:946:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04D1E9D0>

def connect(self):
  conn = self._new_conn()

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\urllib3\connection.py:187:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04D1E9D0>

def _new_conn(self):
    extra_kw = {
            "family": self.family
    }
    if self.source_address:
            extra_kw['source_address'] = self.source_address

    if self.socket_options:
            extra_kw['socket_options'] = self.socket_options

    try:
            dns_host = getattr(self, "_dns_host", self.host)
            conn = create_connection(
                    (dns_host, self.port), self.timeout, **extra_kw)
    except socket.timeout:
            raise urllib3.exceptions.ConnectTimeoutError(
                    self, "Connection to %s timed out. (connect timeout=%s)" %
                    (self.host, self.timeout))
    except OSError as e:
          raise urllib3.exceptions.NewConnectionError(
                    self, "Failed to establish a new connection: %s" % e)

E urllib3.exceptions.NewConnectionError: <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x04D1E9D0>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:103: NewConnectionError

During handling of the above exception, another exception occurred:

warc = '2mementos_queryString.warc'
lookup = '/memento/20130202100000/memento.us/index.php?anotherval=ipsum&someval=lorem', status = 200, location = None

@pytest.mark.parametrize("warc,lookup,status,location", [
    ('salam-home.warc', 'memento/*/cs.odu.edu/~salam/', 302,
     '/memento/20160305192247/cs.odu.edu/~salam/'),
    ('1memento.warc', 'memento/*/memento.us', 302,
     '/memento/20130202100000/memento.us/'),
    ('2mementos.warc', 'memento/*/memento.us', 200, None),
    ('salam-home.warc', 'memento/*/?url=cs.odu.edu/~salam/', 301,
     '/memento/*/cs.odu.edu/~salam/'),
    ('1memento.warc', 'memento/*/?url=memento.us', 301,
     '/memento/*/memento.us'),
    ('2mementos.warc', 'memento/*/?url=memento.us', 301,
     '/memento/*/memento.us'),
    ('2mementos_queryString.warc',
     '/memento/20130202100000/memento.us/' +
     'index.php?anotherval=ipsum&someval=lorem', 200, None),
])
def test_replay_search(warc, lookup, status, location):
  ipwbTest.startReplay(warc)

tests\test_replay.py:53:


tests\testUtil.py:69: in startReplay
cdxjList = indexer.indexFileAt(pathOfWARC, quiet=True)
ipwb\indexer.py:171: in indexFileAt
cdxjLines += getCDXJLinesFromFile(
ipwb\indexer.py:274: in getCDXJLinesFromFile
ipfsHashes = pushToIPFS(hstr, payload)


hstr = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Mon, 30 Jan 2017 18:39:49 GMT\r\nContent-Type: text/html\r\nConnection: close\r\nVary: Accept-Encoding'
payload = b'Memento for 2/2/2013 10:00am\n'

def pushToIPFS(hstr, payload):
    ipfsRetryCount = 5  # WARC->IPFS attempts before giving up
    retryCount = 0
    while retryCount < ipfsRetryCount:
        try:
            # Py 2/3 str/unicode/byte resolution
            if isinstance(hstr, str):
                hstr = s2b(hstr)
            if isinstance(payload, str):
                payload = s2b(payload)

            if len(payload) == 0:  # py-ipfs-api issue #137
                return

            httpHeaderIPFSHash = pushBytesToIPFS(hstr)
            payloadIPFSHash = pushBytesToIPFS(payload)

            if retryCount > 0:
                m = f'Retrying succeeded after {retryCount} attempts'
                print(m)
            return [httpHeaderIPFSHash, payloadIPFSHash]
        except NewConnectionError as e:
            print('IPFS daemon is likely not running.')
            print('Run "ipfs daemon" in another terminal session.')
          sys.exit()

E SystemExit

ipwb\indexer.py:91: SystemExit
------------------------------------------------ Captured stdout call -------------------------------------------------
Sample data not pulled from IPFS.
Check that the IPFS daemon is running.
IPWB replay started on http://localhost:5000

 * Serving Flask app "ipwb.replay" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
IPFS daemon is likely not running.
Run "ipfs daemon" in another terminal session.
------------------------------------------------ Captured stderr call -------------------------------------------------
Processing WARC records in 2mementos_queryString.warc: 1/3
______________________________________________ test_replay_dated_memento ______________________________________________

self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x008B18F8>

def _new_conn(self):
    extra_kw = {
            "family": self.family
    }
    if self.source_address:
            extra_kw['source_address'] = self.source_address

    if self.socket_options:
            extra_kw['socket_options'] = self.socket_options

    try:
            dns_host = getattr(self, "_dns_host", self.host)
          conn = create_connection(
                    (dns_host, self.port), self.timeout, **extra_kw)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:96:


address = ('localhost', 5001), timeout = <object object at 0x00986BF0>, source_address = None
socket_options = [(6, 1, 1)], family = <AddressFamily.AF_UNSPEC: 0>

def create_connection(address, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
                      source_address=None, socket_options=None,
                      family=socket.AF_UNSPEC):
    host, port = address
    if host.startswith('['):
            host = host.strip('[]')
    err = None

    if not family or family == socket.AF_UNSPEC:
            family = urllib3.util.connection.allowed_gai_family()

    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            sock = None
            try:
                    sock = socket.socket(af, socktype, proto)

                    # If provided, set socket level options before connecting.
                    if socket_options is not None:
                            for opt in socket_options:
                                    sock.setsockopt(*opt)

                    if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
                            sock.settimeout(timeout)
                    if source_address:
                            sock.bind(source_address)
                    sock.connect(sa)
                    return sock
            except OSError as e:
                    err = e
                    if sock is not None:
                            sock.close()
                            sock = None

    if err is not None:
          raise err

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:66:


address = ('localhost', 5001), timeout = <object object at 0x00986BF0>, source_address = None
socket_options = [(6, 1, 1)], family = <AddressFamily.AF_UNSPEC: 0>

def create_connection(address, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
                      source_address=None, socket_options=None,
                      family=socket.AF_UNSPEC):
    host, port = address
    if host.startswith('['):
            host = host.strip('[]')
    err = None

    if not family or family == socket.AF_UNSPEC:
            family = urllib3.util.connection.allowed_gai_family()

    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            sock = None
            try:
                    sock = socket.socket(af, socktype, proto)

                    # If provided, set socket level options before connecting.
                    if socket_options is not None:
                            for opt in socket_options:
                                    sock.setsockopt(*opt)

                    if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
                            sock.settimeout(timeout)
                    if source_address:
                            sock.bind(source_address)
                  sock.connect(sa)

E ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:57: ConnectionRefusedError

During handling of the above exception, another exception occurred:

hstr = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Sat, 05 Mar 2016 19:22:47 GMT\r\nContent-Type: text/html\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nVary: Accept-Encoding'
payload = b'\n\n <title>HomePage | Sawood Alam</title>\n <link href="atom.xml" type="application/atom+xml" rel="al...ibrary, Web Archiving, Ruby on Rails, PHP, XHTML, CSS, JavaScript, ExtJS, Urdu, RTL and Linux.

\n\n\n'

def pushToIPFS(hstr, payload):
    ipfsRetryCount = 5  # WARC->IPFS attempts before giving up
    retryCount = 0
    while retryCount < ipfsRetryCount:
        try:
            # Py 2/3 str/unicode/byte resolution
            if isinstance(hstr, str):
                hstr = s2b(hstr)
            if isinstance(payload, str):
                payload = s2b(payload)

            if len(payload) == 0:  # py-ipfs-api issue #137
                return
          httpHeaderIPFSHash = pushBytesToIPFS(hstr)

ipwb\indexer.py:80:


bytes = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Sat, 05 Mar 2016 19:22:47 GMT\r\nContent-Type: text/html\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nVary: Accept-Encoding'

def pushBytesToIPFS(bytes):
    """
    Call the IPFS API to add the byte string to IPFS.
    When IPFS returns a hash, return this to the caller
    """
    global IPFS_API

    # Returns unicode in py2.7, str in py3.7
    try:
      res = IPFS_API.add_bytes(bytes)  # bytes)

ipwb\indexer.py:383:


args = (<ipfshttpclient.client.Client object at 0x03F4ACB8>, b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Sat, 05 Mar 2016 19:22:47 GMT\r\nContent-Type: text/html\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nVary: Accept-Encoding')
kwargs = {}

@wraps(cmd)
def wrapper(*args, **kwargs):
    """Returns the specified field of the command invocation.

    Parameters
    ----------
    args : list
            Positional parameters to pass to the wrapped callable
    kwargs : dict
            Named parameter to pass to the wrapped callable
    """
  res = cmd(*args, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\utils.py:148:


args = (<ipfshttpclient.client.Client object at 0x03F4ACB8>, b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Sat, 05 Mar 2016 19:22:47 GMT\r\nContent-Type: text/html\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nVary: Accept-Encoding')
kwargs = {}

@functools.wraps(func)
def wrapper2(*args: ty.Any, **kwargs: ty.Any) -> R:
  result = func(*args, **kwargs)  # type: T

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\client\base.py:136:


self = <ipfshttpclient.client.Client object at 0x03F4ACB8>
data = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Sat, 05 Mar 2016 19:22:47 GMT\r\nContent-Type: text/html\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nVary: Accept-Encoding'
kwargs = {}, body = <generator object BytesFileStream.body at 0x04CF02C8>
headers = {'Content-Disposition': 'form-data; name="file"; filename="bytes"', 'Content-Type': 'multipart/form-data; boundary="64842d4d95dd4f3490a47a2cc5ad3b25"'}

@utils.return_field('Hash')
@base.returns_single_item(dict)
def add_bytes(self, data, **kwargs):
    """Adds a set of bytes as a file to IPFS.

    .. code-block:: python

            >>> client.add_bytes(b"Mary had a little lamb")
            'QmZfF6C9j4VtoCsTp4KSrhYH47QMd3DNXVZBKaxJdhaPab'

    Also accepts and will stream generator objects.

    Parameters
    ----------
    data : bytes
            Content to be added as a file

    Returns
    -------
            str
                    Hash of the added IPFS object
    """
    body, headers = multipart.stream_bytes(data, chunk_size=self.chunk_size)
  return self._client.request('/add', decoder='json',
                                data=body, headers=headers, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\client\__init__.py:257:


self = <ipfshttpclient.http_requests.ClientSync object at 0x048B4528>, path = '/add', args = []

def request(
            self, path: str,
            args: ty.Sequence[str] = [], *,
            opts: ty.Mapping[str, str] = {},
            decoder: str = "none",
            stream: bool = False,
            offline: bool = False,
            return_result: bool = True,
            auth: auth_t = None,
            cookies: cookies_t = None,
            data: reqdata_sync_t = None,
            headers: headers_t = None,
            timeout: timeout_t = None
) -> ty.Optional[ty.Union[  # noqa: ET122 (checker bug)
    StreamDecodeIteratorSync[bytes],
    StreamDecodeIteratorSync[object],
    bytes,
    ty.List[object],
]]:
    """Sends an HTTP request to the IPFS daemon

    This function returns the contents of the HTTP response from the IPFS
    daemon.

    Raises
    ------
    ~ipfshttpclient.exceptions.ErrorResponse
    ~ipfshttpclient.exceptions.ConnectionError
    ~ipfshttpclient.exceptions.ProtocolError
    ~ipfshttpclient.exceptions.StatusError
    ~ipfshttpclient.exceptions.TimeoutError

    Parameters
    ----------
    path
            The command path relative to the given base
    decoder
            The encoder to use to parse the HTTP response
    stream
            Whether to return an iterable yielding the received items incrementally
            instead of receiving and decoding all items up-front before returning
            them
    args
            Positional parameters to be sent along with the HTTP request
    opts
            Query string paramters to be sent along with the HTTP request
    offline
            Whether to request to daemon to handle this request in “offline-mode”
    return_result
            Whether to decode the values received from the daemon
    auth
            Authentication data to send along with this request as
            ``(username, password)`` tuple
    cookies
            HTTP cookies to send along with each request to the API daemon
    data
            Iterable yielding data to stream from the client to the daemon
    headers
            Custom HTTP headers to pass send along with the request
    timeout
            How many seconds to wait for the server to send data
            before giving up

            Set this to :py:`math.inf` to disable timeouts entirely.
    """
    # Don't attempt to decode response or stream
    # (which would keep an iterator open that will then never be waited for)
    if not return_result:
            decoder = None

    # HTTP method must always be "POST" since go-IPFS 0.5
    method = "POST"
    if "use_http_head_for_no_result" in self.workarounds and not return_result:  # pragma: no cover
            method = "HEAD"

    parser = encoding.get_encoding(decoder if decoder else "none")
  closables, res = self._request(
            method, path, map_args_to_params(args, opts, offline=offline),
            auth=auth, data=data, headers=headers, timeout=timeout,
            chunk_size=None,
    )

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\http_common.py:564:


self = <ipfshttpclient.http_requests.ClientSync object at 0x048B4528>, method = 'POST', path = 'add', params = []

def _request(
            self, method: str, path: str, params: ty.Sequence[ty.Tuple[str, str]], *,
            auth: auth_t,
            data: reqdata_sync_t,
            headers: headers_t,
            timeout: timeout_t,
            chunk_size: ty.Optional[int]
) -> ty.Tuple[ty.List[Closable], ty.Iterator[bytes]]:
    # Ensure path is relative so that it is resolved relative to the base
    while path.startswith("/"):
            path = path[1:]

    url = urllib.parse.urljoin(self._base_url, path)

    try:
            # Determine session object to use
            closables, session = self._access_session()

            # Do HTTP request (synchronously) and map exceptions
            try:
                  res = session.request(
                            method=method,
                            url=url,
                            **map_args_to_requests(
                                    params=params,
                                    auth=auth,
                                    headers=headers,
                                    timeout=timeout,
                            ),
                            data=data,
                            stream=True,
                    )

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\http_requests.py:152:


self = <ipfshttpclient.requests_wrapper.Session object at 0x008B1718>, method = 'POST'
url = 'http://localhost:5001/api/v0/add', args = ()
kwargs = {'data': <generator object BytesFileStream.body at 0x04CF02C8>, 'headers': {'Content-Disposition': 'form-data; name="f...s"', 'Content-Type': 'multipart/form-data; boundary="64842d4d95dd4f3490a47a2cc5ad3b25"'}, 'params': {}, 'stream': True}
family = <AddressFamily.AF_UNSPEC: 0>

def request(self, method, url, *args, **kwargs):
    family = kwargs.pop("family", self.family)
    if family != socket.AF_UNSPEC:
            # Inject provided address family value as extension to scheme
            url = urllib.parse.urlparse(url)
            url = url._replace(scheme="{0}+{1}".format(url.scheme, AF2NAME[int(family)]))
            url = url.geturl()
  return super().request(method, url, *args, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:219:


self = <ipfshttpclient.requests_wrapper.Session object at 0x008B1718>, method = 'POST'
url = 'http://localhost:5001/api/v0/add', params = {}, data = <generator object BytesFileStream.body at 0x04CF02C8>
headers = {'Content-Disposition': 'form-data; name="file"; filename="bytes"', 'Content-Type': 'multipart/form-data; boundary="64842d4d95dd4f3490a47a2cc5ad3b25"'}
cookies = None, files = None, auth = None, timeout = None, allow_redirects = True, proxies = {}, hooks = None
stream = True, verify = None, cert = None, json = None

def request(self, method, url,
        params=None, data=None, headers=None, cookies=None, files=None,
        auth=None, timeout=None, allow_redirects=True, proxies=None,
        hooks=None, stream=None, verify=None, cert=None, json=None):
    """Constructs a :class:`Request <Request>`, prepares it and sends it.
    Returns :class:`Response <Response>` object.

    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary or bytes to be sent in the query
        string for the :class:`Request`.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
        object to send in the body of the :class:`Request`.
    :param json: (optional) json to send in the body of the
        :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the
        :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the
        :class:`Request`.
    :param files: (optional) Dictionary of ``'filename': file-like-objects``
        for multipart encoding upload.
    :param auth: (optional) Auth tuple or callable to enable
        Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How long to wait for the server to send
        data before giving up, as a float, or a :ref:`(connect timeout,
        read timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Set to True by default.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol or protocol and
        hostname to the URL of the proxy.
    :param stream: (optional) whether to immediately download the response
        content. Defaults to ``False``.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify
        the server's TLS certificate, or a string, in which case it must be a path
        to a CA bundle to use. Defaults to ``True``.
    :param cert: (optional) if String, path to ssl client cert file (.pem).
        If Tuple, ('cert', 'key') pair.
    :rtype: requests.Response
    """
    # Create the Request.
    req = Request(
        method=method.upper(),
        url=url,
        headers=headers,
        files=files,
        data=data or {},
        json=json,
        params=params or {},
        auth=auth,
        cookies=cookies,
        hooks=hooks,
    )
    prep = self.prepare_request(req)

    proxies = proxies or {}

    settings = self.merge_environment_settings(
        prep.url, proxies, stream, verify, cert
    )

    # Send the request.
    send_kwargs = {
        'timeout': timeout,
        'allow_redirects': allow_redirects,
    }
    send_kwargs.update(settings)
  resp = self.send(prep, **send_kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py:530:


self = <ipfshttpclient.requests_wrapper.Session object at 0x008B1718>, request = <PreparedRequest [POST]>
kwargs = {'cert': None, 'proxies': OrderedDict(), 'stream': True, 'timeout': None, ...}, allow_redirects = True
stream = True, hooks = {'response': []}, adapter = <ipfshttpclient.requests_wrapper.HTTPAdapter object at 0x008B1520>
start = 206.4701097

def send(self, request, **kwargs):
    """Send a given PreparedRequest.

    :rtype: requests.Response
    """
    # Set defaults that the hooks can utilize to ensure they always have
    # the correct parameters to reproduce the previous request.
    kwargs.setdefault('stream', self.stream)
    kwargs.setdefault('verify', self.verify)
    kwargs.setdefault('cert', self.cert)
    kwargs.setdefault('proxies', self.proxies)

    # It's possible that users might accidentally send a Request object.
    # Guard against that specific failure case.
    if isinstance(request, Request):
        raise ValueError('You can only send PreparedRequests.')

    # Set up variables needed for resolve_redirects and dispatching of hooks
    allow_redirects = kwargs.pop('allow_redirects', True)
    stream = kwargs.get('stream')
    hooks = request.hooks

    # Get the appropriate adapter to use
    adapter = self.get_adapter(url=request.url)

    # Start time (approximately) of the request
    start = preferred_clock()

    # Send the request
  r = adapter.send(request, **kwargs)

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py:643:


self = <ipfshttpclient.requests_wrapper.HTTPAdapter object at 0x008B1520>, request = <PreparedRequest [POST]>
stream = True, timeout = Timeout(connect=None, read=None, total=None), verify = True, cert = None
proxies = OrderedDict()

def send(self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None):
    """Sends PreparedRequest object. Returns Response object.

    :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
    :param stream: (optional) Whether to stream the request content.
    :param timeout: (optional) How long to wait for the server to send
        data before giving up, as a float, or a :ref:`(connect timeout,
        read timeout) <timeouts>` tuple.
    :type timeout: float or tuple or urllib3 Timeout object
    :param verify: (optional) Either a boolean, in which case it controls whether
        we verify the server's TLS certificate, or a string, in which case it
        must be a path to a CA bundle to use
    :param cert: (optional) Any user-provided SSL certificate to be trusted.
    :param proxies: (optional) The proxies dictionary to apply to the request.
    :rtype: requests.Response
    """

    try:
        conn = self.get_connection(request.url, proxies)
    except LocationValueError as e:
        raise InvalidURL(e, request=request)

    self.cert_verify(conn, request.url, verify, cert)
    url = self.request_url(request, proxies)
    self.add_headers(request, stream=stream, timeout=timeout, verify=verify, cert=cert, proxies=proxies)

    chunked = not (request.body is None or 'Content-Length' in request.headers)

    if isinstance(timeout, tuple):
        try:
            connect, read = timeout
            timeout = TimeoutSauce(connect=connect, read=read)
        except ValueError as e:
            # this may raise a string formatting error.
            err = ("Invalid timeout {}. Pass a (connect, read) "
                   "timeout tuple, or a single float to set "
                   "both timeouts to the same value".format(timeout))
            raise ValueError(err)
    elif isinstance(timeout, TimeoutSauce):
        pass
    else:
        timeout = TimeoutSauce(connect=timeout, read=timeout)

    try:
        if not chunked:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout
            )

        # Send the request.
        else:
            if hasattr(conn, 'proxy_pool'):
                conn = conn.proxy_pool

            low_conn = conn._get_conn(timeout=DEFAULT_POOL_TIMEOUT)

            try:
                low_conn.putrequest(request.method,
                                    url,
                                    skip_accept_encoding=True)

                for header, value in request.headers.items():
                    low_conn.putheader(header, value)
              low_conn.endheaders()

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\adapters.py:467:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x008B18F8>, message_body = None

def endheaders(self, message_body=None, *, encode_chunked=False):
    """Indicate that the last header line has been sent to the server.

    This method sends the request to the server.  The optional message_body
    argument can be used to pass a message body associated with the
    request.
    """
    if self.__state == _CS_REQ_STARTED:
        self.__state = _CS_REQ_SENT
    else:
        raise CannotSendHeader()
  self._send_output(message_body, encode_chunked=encode_chunked)

....\AppData\Local\Programs\Python\Python38-32\lib\http\client.py:1235:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x008B18F8>, message_body = None
encode_chunked = False

def _send_output(self, message_body=None, encode_chunked=False):
    """Send the currently buffered request and clear the buffer.

    Appends an extra \\r\\n to the buffer.
    A message_body may be specified, to be appended to the request.
    """
    self._buffer.extend((b"", b""))
    msg = b"\r\n".join(self._buffer)
    del self._buffer[:]
  self.send(msg)

....\AppData\Local\Programs\Python\Python38-32\lib\http\client.py:1006:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x008B18F8>
data = b'POST /api/v0/add?stream-channels=true HTTP/1.1\r\nHost: localhost:5001\r\nContent-Disposition: form-data; name="file...\nContent-Type: multipart/form-data; boundary="64842d4d95dd4f3490a47a2cc5ad3b25"\r\nTransfer-Encoding: chunked\r\n\r\n'

def send(self, data):
    """Send `data' to the server.
    ``data`` can be a string object, a bytes object, an array object, a
    file-like object that supports a .read() method, or an iterable object.
    """

    if self.sock is None:
        if self.auto_open:
          self.connect()

....\AppData\Local\Programs\Python\Python38-32\lib\http\client.py:946:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x008B18F8>

def connect(self):
  conn = self._new_conn()

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\urllib3\connection.py:187:


self = <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x008B18F8>

def _new_conn(self):
    extra_kw = {
            "family": self.family
    }
    if self.source_address:
            extra_kw['source_address'] = self.source_address

    if self.socket_options:
            extra_kw['socket_options'] = self.socket_options

    try:
            dns_host = getattr(self, "_dns_host", self.host)
            conn = create_connection(
                    (dns_host, self.port), self.timeout, **extra_kw)
    except socket.timeout:
            raise urllib3.exceptions.ConnectTimeoutError(
                    self, "Connection to %s timed out. (connect timeout=%s)" %
                    (self.host, self.timeout))
    except OSError as e:
          raise urllib3.exceptions.NewConnectionError(
                    self, "Failed to establish a new connection: %s" % e)

E urllib3.exceptions.NewConnectionError: <ipfshttpclient.requests_wrapper.HTTPConnection object at 0x008B18F8>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it

....\AppData\Local\Programs\Python\Python38-32\lib\site-packages\ipfshttpclient\requests_wrapper.py:103: NewConnectionError

During handling of the above exception, another exception occurred:

def test_replay_dated_memento():
  ipwbTest.startReplay('salam-home.warc')

tests\test_replay.py:65:


tests\testUtil.py:69: in startReplay
cdxjList = indexer.indexFileAt(pathOfWARC, quiet=True)
ipwb\indexer.py:171: in indexFileAt
cdxjLines += getCDXJLinesFromFile(
ipwb\indexer.py:274: in getCDXJLinesFromFile
ipfsHashes = pushToIPFS(hstr, payload)


hstr = b'HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Sat, 05 Mar 2016 19:22:47 GMT\r\nContent-Type: text/html\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nVary: Accept-Encoding'
payload = b'\n\n <title>HomePage | Sawood Alam</title>\n <link href="atom.xml" type="application/atom+xml" rel="al...ibrary, Web Archiving, Ruby on Rails, PHP, XHTML, CSS, JavaScript, ExtJS, Urdu, RTL and Linux.

\n\n\n'

def pushToIPFS(hstr, payload):
    ipfsRetryCount = 5  # WARC->IPFS attempts before giving up
    retryCount = 0
    while retryCount < ipfsRetryCount:
        try:
            # Py 2/3 str/unicode/byte resolution
            if isinstance(hstr, str):
                hstr = s2b(hstr)
            if isinstance(payload, str):
                payload = s2b(payload)

            if len(payload) == 0:  # py-ipfs-api issue #137
                return

            httpHeaderIPFSHash = pushBytesToIPFS(hstr)
            payloadIPFSHash = pushBytesToIPFS(payload)

            if retryCount > 0:
                m = f'Retrying succeeded after {retryCount} attempts'
                print(m)
            return [httpHeaderIPFSHash, payloadIPFSHash]
        except NewConnectionError as e:
            print('IPFS daemon is likely not running.')
            print('Run "ipfs daemon" in another terminal session.')
          sys.exit()

E SystemExit

ipwb\indexer.py:91: SystemExit
------------------------------------------------ Captured stdout call -------------------------------------------------
Sample data not pulled from IPFS.
Check that the IPFS daemon is running.
IPWB replay started on http://localhost:5000

 * Serving Flask app "ipwb.replay" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
IPFS daemon is likely not running.
Run "ipfs daemon" in another terminal session.
------------------------------------------------ Captured stderr call -------------------------------------------------
Processing WARC records in salam-home.warc: 2/6
_______________________________________________ test_unit_commandDaemon _______________________________________________

def test_unit_commandDaemon():
  replay.commandDaemon('start')

tests\test_replay.py:185:


ipwb\replay.py:141: in commandDaemon
subprocess.Popen(['ipfs', 'daemon'])
....\AppData\Local\Programs\Python\Python38-32\lib\subprocess.py:854: in __init__
self._execute_child(args, executable, preexec_fn, close_fds,


self = <subprocess.Popen object at 0x04DF9538>, args = 'ipfs daemon', executable = None, preexec_fn = None
close_fds = True, pass_fds = (), cwd = None, env = None, startupinfo = <subprocess.STARTUPINFO object at 0x008B0238>
creationflags = 0, shell = False, p2cread = -1, p2cwrite = -1, c2pread = -1, c2pwrite = -1, errread = -1, errwrite = -1
unused_restore_signals = True, unused_start_new_session = False

def _execute_child(self, args, executable, preexec_fn, close_fds,
                   pass_fds, cwd, env,
                   startupinfo, creationflags, shell,
                   p2cread, p2cwrite,
                   c2pread, c2pwrite,
                   errread, errwrite,
                   unused_restore_signals, unused_start_new_session):
    """Execute program (MS Windows version)"""

    assert not pass_fds, "pass_fds not supported on Windows."

    if isinstance(args, str):
        pass
    elif isinstance(args, bytes):
        if shell:
            raise TypeError('bytes args is not allowed on Windows')
        args = list2cmdline([args])
    elif isinstance(args, os.PathLike):
        if shell:
            raise TypeError('path-like args is not allowed when '
                            'shell is true')
        args = list2cmdline([args])
    else:
        args = list2cmdline(args)

    if executable is not None:
        executable = os.fsdecode(executable)

    # Process startup details
    if startupinfo is None:
        startupinfo = STARTUPINFO()
    else:
        # bpo-34044: Copy STARTUPINFO since it is modified above,
        # so the caller can reuse it multiple times.
        startupinfo = startupinfo.copy()

    use_std_handles = -1 not in (p2cread, c2pwrite, errwrite)
    if use_std_handles:
        startupinfo.dwFlags |= _winapi.STARTF_USESTDHANDLES
        startupinfo.hStdInput = p2cread
        startupinfo.hStdOutput = c2pwrite
        startupinfo.hStdError = errwrite

    attribute_list = startupinfo.lpAttributeList
    have_handle_list = bool(attribute_list and
                            "handle_list" in attribute_list and
                            attribute_list["handle_list"])

    # If we were given an handle_list or need to create one
    if have_handle_list or (use_std_handles and close_fds):
        if attribute_list is None:
            attribute_list = startupinfo.lpAttributeList = {}
        handle_list = attribute_list["handle_list"] = \
            list(attribute_list.get("handle_list", []))

        if use_std_handles:
            handle_list += [int(p2cread), int(c2pwrite), int(errwrite)]

        handle_list[:] = self._filter_handle_list(handle_list)

        if handle_list:
            if not close_fds:
                warnings.warn("startupinfo.lpAttributeList['handle_list'] "
                              "overriding close_fds", RuntimeWarning)

            # When using the handle_list we always request to inherit
            # handles but the only handles that will be inherited are
            # the ones in the handle_list
            close_fds = False

    if shell:
        startupinfo.dwFlags |= _winapi.STARTF_USESHOWWINDOW
        startupinfo.wShowWindow = _winapi.SW_HIDE
        comspec = os.environ.get("COMSPEC", "cmd.exe")
        args = '{} /c "{}"'.format (comspec, args)

    if cwd is not None:
        cwd = os.fsdecode(cwd)

    sys.audit("subprocess.Popen", executable, args, cwd, env)

    # Start the process
    try:
      hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                                 # no special security
                                 None, None,
                                 int(not close_fds),
                                 creationflags,
                                 env,
                                 cwd,
                                 startupinfo)

E FileNotFoundError: [WinError 2] The system cannot find the file specified

....\AppData\Local\Programs\Python\Python38-32\lib\subprocess.py:1307: FileNotFoundError
=============================================== short test summary info ===============================================
FAILED tests/test_backends.py::test_local - requests.exceptions.InvalidSchema: No connection adapters were found for ...
FAILED tests/test_backends.py::test_ipfs_failure - ipfshttpclient.exceptions.ConnectionError: ConnectionError: HTTPCo...
FAILED tests/test_indexing.py::test_cdxj_warc_responseRecordCount - SystemExit
FAILED tests/test_indexing.py::test_warc_ipwbIndexerBrokenWARCRecord - SystemExit
FAILED tests/test_memento.py::test_acceptdatetime_status[5mementos.warc-timegate/memento.us-Thu, 31 May 2007 20:35:00 GMT-302]
FAILED tests/test_memento.py::test_acceptdatetime_status[5mementos.warc-timegate/memento.us-Thu, 31 May 2007 20:35:00-400]
FAILED tests/test_memento.py::test_acceptdatetime_status[5mementos.warc-timegate/memento.us-Thu, 31 May 2007 20:35 GMT-400]
FAILED tests/test_memento.py::test_acceptdatetime_status[5mementos.warc-timegate/memento.us-20181001123636-400] - Sys...
FAILED tests/test_memento.py::test_mementoRelations_one - SystemExit
FAILED tests/test_memento.py::test_mementoRelations_two - SystemExit
FAILED tests/test_memento.py::test_mementoRelations_three - SystemExit
FAILED tests/test_memento.py::test_mementoRelations_four - SystemExit
FAILED tests/test_memento.py::test_mementoRelations_five - SystemExit
FAILED tests/test_randomized_add.py::test_push - SystemExit
FAILED tests/test_replay.py::test_replay_404[HTTP404.warc-memento/20200202100000/memento.us/-True] - SystemExit
FAILED tests/test_replay.py::test_replay_404[HTTP404.warc-memento/20200202100000/memento.ca/-False] - SystemExit
FAILED tests/test_replay.py::test_replay_404[HTTP404.warc-loremipsum-False] - SystemExit
FAILED tests/test_replay.py::test_replay_search[salam-home.warc-memento//cs.odu.edu/~salam/-302-/memento/20160305192247/cs.odu.edu/~salam/]
FAILED tests/test_replay.py::test_replay_search[1memento.warc-memento//memento.us-302-/memento/20130202100000/memento.us/]
FAILED tests/test_replay.py::test_replay_search[2mementos.warc-memento//memento.us-200-None] - SystemExit
FAILED tests/test_replay.py::test_replay_search[salam-home.warc-memento//?url=cs.odu.edu/~salam/-301-/memento//cs.odu.edu/~salam/]
FAILED tests/test_replay.py::test_replay_search[1memento.warc-memento//?url=memento.us-301-/memento//memento.us] - ...
FAILED tests/test_replay.py::test_replay_search[2mementos.warc-memento//?url=memento.us-301-/memento/*/memento.us]
FAILED tests/test_replay.py::test_replay_search[2mementos_queryString.warc-/memento/20130202100000/memento.us/index.php?anotherval=ipsum&someval=lorem-200-None]
FAILED tests/test_replay.py::test_replay_dated_memento - SystemExit
FAILED tests/test_replay.py::test_unit_commandDaemon - FileNotFoundError: [WinError 2] The system cannot find the fil...
================================ 26 failed, 72 passed, 16 skipped in 220.89s (0:03:40) ================================

It might be good to start the daemon and run the tests individually to isolate the problem.
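
As a rough aid for that, the sketch below checks whether anything is listening on the API port before the tests start. It assumes the daemon's API is at localhost:5001, as in the URL shown in the traceback above; `api_port_open` is a hypothetical helper and not part of ipwb.

```python
import socket


def api_port_open(host="localhost", port=5001, timeout=2.0):
    """Return True if a TCP connection to host:port can be established.

    The default port 5001 is an assumption taken from the traceback URL
    (http://localhost:5001/api/v0/add); adjust it if the daemon is
    configured differently.
    """
    try:
        # create_connection raises OSError (e.g. WinError 10061) when
        # nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if api_port_open():
        print("Something is listening on the API port.")
    else:
        print("Nothing is listening on the API port; is the daemon running?")
```

This only shows that the port accepts connections, not that the process behind it is a healthy IPFS daemon.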

@machawk1
Member

machawk1 commented Jul 3, 2020

Dropping to the debugger on the first failure might be informative, e.g. (view raw),

Output
> pytest -x --pdb
================================================= test session starts =================================================
platform win32 -- Python 3.8.3, pytest-5.4.3, py-1.9.0, pluggy-0.13.1
rootdir: C:\Users\mrk335\Desktop\ipwb-master, inifile: setup.cfg
plugins: cov-2.10.0, flake8-1.0.6
collected 114 items

tests\test_backends.py F

traceback >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

def test_local():
  assert get_web_archive_index(SAMPLE_INDEX).startswith(
        '!context ["http://tools.ietf.org/html/rfc7089"]'
    )

tests\test_backends.py:16:


ipwb\backends.py:87: in get_web_archive_index
response = fetch_web_index(path)
ipwb\backends.py:53: in fetch_web_index
return requests.get(path).text
....\appdata\local\programs\python\python38-32\lib\site-packages\requests\api.py:76: in get
return request('get', url, params=params, **kwargs)
....\appdata\local\programs\python\python38-32\lib\site-packages\requests\api.py:61: in request
return session.request(method=method, url=url, **kwargs)
....\appdata\local\programs\python\python38-32\lib\site-packages\requests\sessions.py:530: in request
resp = self.send(prep, **send_kwargs)
....\appdata\local\programs\python\python38-32\lib\site-packages\requests\sessions.py:637: in send
adapter = self.get_adapter(url=request.url)


self = <requests.sessions.Session object at 0x053BE7C0>
url = 'C:\Users\mrk335\Desktop\ipwb-master\samples\indexes\salam-home.cdxj'

def get_adapter(self, url):
    """
    Returns the appropriate connection adapter for the given URL.

    :rtype: requests.adapters.BaseAdapter
    """
    for (prefix, adapter) in self.adapters.items():

        if url.lower().startswith(prefix.lower()):
            return adapter

    # Nothing matches :-/
  raise InvalidSchema("No connection adapters were found for {!r}".format(url))

E requests.exceptions.InvalidSchema: No connection adapters were found for 'C:\Users\mrk335\Desktop\ipwb-master\samples\indexes\salam-home.cdxj'

....\appdata\local\programs\python\python38-32\lib\site-packages\requests\sessions.py:730: InvalidSchema

entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

PDB post_mortem (IO-capturing turned off) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
c:\users\mrk335\appdata\local\programs\python\python38-32\lib\site-packages\requests\sessions.py(730)get_adapter()
-> raise InvalidSchema("No connection adapters were found for {!r}".format(url))

Failing test:

def test_local():
    assert get_web_archive_index(SAMPLE_INDEX).startswith(
        '!context ["http://tools.ietf.org/html/rfc7089"]'
    )

Perhaps it is trying to read the whole string as a URI and reporting what it parses as the scheme (!context) as invalid.

I presume testing "local" means passing the function a string and seeing how it behaves. If so, the test may be failing rightly, given the questionable support for this in backends.py.

EDIT: still odd that this passes on non-Windows OSes (e.g., macOS, unable to replicate).
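
One way to probe the scheme hypothesis in an interactive prompt is to ask `urllib.parse` what it reports as the scheme for the strings involved. This is only a sketch with the standard library, not what requests or ipwb do internally; `reported_scheme` is a hypothetical helper and the two file paths are made-up stand-ins for the real sample path.

```python
from urllib.parse import urlparse


def reported_scheme(value):
    """Return the scheme urllib.parse extracts from value ('' if none)."""
    return urlparse(value).scheme


samples = [
    r'C:\Users\example\samples\indexes\salam-home.cdxj',  # hypothetical Windows path
    '/home/example/samples/indexes/salam-home.cdxj',      # hypothetical POSIX path
    '!context ["http://tools.ietf.org/html/rfc7089"]',    # first line of the index
    'http://localhost:5001/api/v0/add',                   # URL from the traceback
]

for value in samples:
    # Print the parsed scheme next to the input for side-by-side comparison.
    print(repr(reported_scheme(value)), value)
```

With the standard library parser, a Windows drive letter is reported as a one-letter scheme while a POSIX path has none, which may be one reason the same input behaves differently across platforms.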

@machawk1
Member

machawk1 commented Jul 3, 2020

get_web_archive_index() handles an index defined by (1) an IPFS multihash address, (2) a web URI, or (3) a local file/path, but does not interpret the string !context ["http://tools.ietf.org/html/rfc7089"]. Further testing is needed to determine which of these three ordered code paths !context ["http://tools.ietf.org/html/rfc7089"] takes.

Update: It's path 1, the call to fetch_ipfs_index(), so on macOS, !context ["http://tools.ietf.org/html/rfc7089"] is treated as a successful multihash but on Windows, it fails.
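
To compare which of the three branches a given input would fall into on each platform, a stand-alone classifier along these lines could be run in both environments. It is a sketch under assumptions: the checks are illustrative heuristics rather than the logic in backends.py, `classify_index_location` is a hypothetical name, and only CIDv0-style hashes (46 base58 characters starting with "Qm") are recognized.

```python
import os
import re
from urllib.parse import urlparse

# Assumption: only CIDv0 hashes are considered, i.e. "Qm" followed by
# 44 base58 characters (no 0, O, I or l).
CIDV0_PATTERN = re.compile(r'^Qm[1-9A-HJ-NP-Za-km-z]{44}$')


def classify_index_location(value):
    """Return 'ipfs', 'web', 'local' or 'unknown' for an index location.

    The checks run in the same order as the three options listed above.
    """
    if CIDV0_PATTERN.match(value):
        return 'ipfs'
    if urlparse(value).scheme in ('http', 'https'):
        return 'web'
    if os.path.exists(value):
        return 'local'
    return 'unknown'


if __name__ == '__main__':
    for candidate in (
        'Qm' + 'a' * 44,                                    # made-up CIDv0-shaped string
        'http://localhost:5001/api/v0/add',
        os.getcwd(),
        '!context ["http://tools.ietf.org/html/rfc7089"]',
    ):
        print(classify_index_location(candidate), candidate)
```

Checking the scheme against an explicit allow-list, rather than testing for any scheme at all, keeps a Windows drive letter from being mistaken for a web URI.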

@ibnesayeed
Member Author

@machawk1 do you think fixing #687 might improve the situation here?

@machawk1
Member

machawk1 commented Jul 3, 2020

It might help, @ibnesayeed, but I am curious as to why it is considered a hash on macOS but not Windows. This might also point to a bug in py-ipfs-http-client. I am hoping to investigate further.

@ibnesayeed
Member Author

@machawk1 did you try to investigate these issues at an atomic level in the interactive Python prompt on Windows?

@machawk1
Member

machawk1 commented Jul 3, 2020

@ibnesayeed What I tried is documented above. More to come.
