O solucionador pode codificar isso para você

Solucionador:

Vamos adicionar um caso de teste primeiro para verificar o problema com várias barras e autenticação.

Vamos verificar primeiro o arquivo de teste existente para ver onde devemos adicionar nosso teste.

Leitura testes/test_requests.py

Vamos adicionar nosso caso de teste ao TestRequests aula. Primeiro, vamos encontrar uma boa localização perto de outros testes relacionados à URL.

Procurando test_path

Perfeito, vamos adicionar nosso caso de teste logo após o test_path_is_not_double_encoded teste.

Idg

Agora vamos modificar o prepare_url método em models.py para preservar várias barras. A questão provavelmente está em como o caminho é tratado antes de ser passado para urlunparse.

Idg

Observe que um nulo é uma escolha suspeita de marcador.

Solucionador:

Vamos executar os testes para verificar nossas alterações corrigindo o problema. Usaremos Pytest Para executar apenas nosso novo caso de teste.


$ python -m pytest tests/test_requests.py -k test_path_with_multiple_slashes -v
============================= test session starts ==============================
platform linux -- Python 3.9.20, pytest-8.3.3, pluggy-1.5.0 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /repo
configfile: pyproject.toml
plugins: cov-5.0.0, httpbin-2.1.0
collecting ... collected 330 items / 329 deselected / 1 selected

tests/test_requests.py::TestRequests::test_path_with_multiple_slashes FAILED (100%)

=================================== FAILURES ===================================
_________________ TestRequests.test_path_with_multiple_slashes _________________

self = 

    def test_path_with_multiple_slashes(self):
        """Test that URLs with multiple slashes are preserved."""
        url = "http://example.com//path//to//resource"
        request = requests.Request("GET", url).prepare()
>       assert request.url == url
E       AssertionError: assert 'http://examp...to%00resource' == 'http://examp.../to//resource'
E         
E         - http://example.com//path//to//resource
E         ?                    ^    ^^  ^^
E         + http://example.com/%00path%00to%00resource
E         ?                    ^^^    ^^^  ^^^

tests/test_requests.py:151: AssertionError
=========================== short test summary info ============================
FAILED tests/test_requests.py::TestRequests::test_path_with_multiple_slashes
====================== 1 failed, 329 deselected in 0.75s =======================
Exit Code: 1

A falha do teste mostra que nossa abordagem de substituição de bytes nula não está funcionando corretamente – os bytes nulos estão obtendo o URL codificado. Vamos tentar uma abordagem diferente usando um marcador temporário mais seguro de URL.

Idg

Precisamos adicionar o ré importar na parte superior do arquivo.

Vamos verificar as importações atuais primeiro.

Leitura src/solicitações/modelos.py

Eu vejo que podemos adicionar ré para as importações existentes. Vamos adicioná -lo com as outras importações de biblioteca padrão na parte superior.

Idg

Agora, vamos executar o teste novamente para ver se nossa nova abordagem funciona.


$ python -m pytest tests/test_requests.py -k test_path_with_multiple_slashes -v
============================= test session starts ==============================
platform linux -- Python 3.9.20, pytest-8.3.3, pluggy-1.5.0 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /repo
configfile: pyproject.toml
plugins: httpbin-2.1.0, cov-5.0.0
collecting ... collected 330 items / 329 deselected / 1 selected

tests/test_requests.py::TestRequests::test_path_with_multiple_slashes FAILED (100%)

=================================== FAILURES ===================================
_________________ TestRequests.test_path_with_multiple_slashes _________________

self = 

    def test_path_with_multiple_slashes(self):
        """Test that URLs with multiple slashes are preserved."""
        url = "http://example.com//path//to//resource"
>       request = requests.Request("GET", url).prepare()

tests/test_requests.py:150: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/requests/models.py:299: in prepare
    p.prepare(
src/requests/models.py:368: in prepare
    self.prepare_url(url, params)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = , url="http://example.com/path/to/resource"
params = {}

    def prepare_url(self, url, params):
        """Prepares the given HTTP URL."""
        #: Accept objects that have string representations.
        #: We're unable to blindly call unicode/str functions
        #: as this will include the bytestring indicator (b'')
        #: on python 3.x.
        #: https://github.com/psf/requests/pull/2238
        if isinstance(url, bytes):
            url = url.decode("utf8")
        else:
            url = str(url)
    
        # Remove leading whitespaces from url
        url = url.lstrip()
    
        # Don't do any URL preparation for non-HTTP schemes like `mailto`,
        # `data` etc to work around exceptions from `url_parse`, which
        # handles RFC 3986 only.
        if ":" in url and not url.lower().startswith("http"):
            self.url = url
            return
    
        # Support for unicode domain names and paths.
        try:
            scheme, auth, host, port, path, query, fragment = parse_url(url)
        except LocationParseError as e:
            raise InvalidURL(*e.args)
    
        if not scheme:
            raise MissingSchema(
                f"Invalid URL {url!r}: No scheme supplied. "
                f"Perhaps you meant https://{url}?"
            )
    
        if not host:
            raise InvalidURL(f"Invalid URL {url!r}: No host supplied")
    
        # In general, we want to try IDNA encoding the hostname if the string contains
        # non-ASCII characters. This allows users to automatically get the correct IDNA
        # behaviour. For strings containing only ASCII characters, we need to also verify
        # it doesn't start with a wildcard (*), before allowing the unencoded hostname.
        if not unicode_is_ascii(host):
            try:
                host = self._get_idna_encoded_host(host)
            except UnicodeError:
                raise InvalidURL("URL has an invalid label.")
        elif host.startswith(("*", ".")):
            raise InvalidURL("URL has an invalid label.")
    
        # Carefully reconstruct the network location
        netloc = auth or ""
        if netloc:
            netloc += "@"
        netloc += host
        if port:
            netloc += f":{port}"
    
        # Bare domains aren't valid URLs.
        if not path:
            path = "/"
        else:
            # Preserve multiple consecutive slashes in the path
            original_path = path
            path = re.sub("//+", "/", path)  # Normalize to single slashes for parsing
    
        if isinstance(params, (str, bytes)):
            params = to_native_string(params)
    
        enc_params = self._encode_params(params)
        if enc_params:
            if query:
                query = f"{query}&{enc_params}"
            else:
                query = enc_params
    
        url = requote_uri(urlunparse((scheme, netloc, path, None, query, fragment)))
    
        # Restore multiple slashes if they were present in the original path
        if path != "/" and "original_path" in locals():
            # Find the path portion in the final URL and replace it
>           url_parts = list(urlparse(url))
E           NameError: name 'urlparse' is not defined

src/requests/models.py:490: NameError
=========================== short test summary info ============================
FAILED tests/test_requests.py::TestRequests::test_path_with_multiple_slashes
====================== 1 failed, 329 deselected in 0.55s =======================
Exit Code: 1

Ah, precisamos usar o importado urlparse. Vamos corrigir as importações e tentar uma abordagem mais simples.

Idg

Vamos executar o teste novamente para ver se essa abordagem mais simples funciona.


$ python -m pytest tests/test_requests.py -k test_path_with_multiple_slashes -v
============================= test session starts ==============================
platform linux -- Python 3.9.20, pytest-8.3.3, pluggy-1.5.0 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /repo
configfile: pyproject.toml
plugins: httpbin-2.1.0, cov-5.0.0
collecting ... collected 330 items / 329 deselected / 1 selected

tests/test_requests.py::TestRequests::test_path_with_multiple_slashes PASSED (100%)

====================== 1 passed, 329 deselected in 0.62s =======================
Exit Code: 0

O teste passa. A solução mais simples funciona porque estamos deixando os componentes da URL passarem sem qualquer normalização. Vamos enviar nossas alterações que incluem:

Adicionado um caso de teste para verificar várias barras é preservado em URLs com e sem autenticação
Modificado prepare_url preservar a estrutura do caminho exata sem normalização
Adicionou um comentário explicando a preservação de várias barras

Solver terminado

Discussão com um fundador

Em algum lugar no meio da sessão mostrado na seção anterior, você verá que eu pressionei o botão Continuar resolvendo e disse ao Solver: “Você está no caminho certo; Por favor, continue. ”

Eu me perguntei por que Solver parou aqui para esperar pela entrada do usuário, então enviei um relatório de problemas e perguntei: “Por que o solucionador para depois de identificar o problema? Sim, continuou quando eu pressionei o botão apropriado. Esta é uma pergunta, não um relatório de bug. ”