
SOCKS problem with 8.7.0+ #13455

Closed
lazulkia opened this issue Apr 23, 2024 · 21 comments

Comments

@lazulkia

lazulkia commented Apr 23, 2024

I did this

When 8.7.0 was released, I noticed that my script for updating mpv was failing. It uses curl_multi and downloads the mpv.zip file in 6 chunks. It seems that some easy task gets stuck in an infinite loop. The script had been working fine for years before.
I didn't have the energy to find out why at the time, so I fell back to 8.6.0.
When 8.7.1 was released I tested it again and still had the problem.
Only now have I tried to find out why.

First of all, it should be noted that I am not a programmer; I just use AutoHotkey to call libcurl.dll. I tested libcurl.dll from two sources:
https://curl.se/windows/
https://community.chocolatey.org/packages/curl

I again tested chunked downloads; the script was modified from this example:
https://curl.se/libcurl/c/10-at-a-time.html

I can only access the Internet properly through a VPS.
I have to set the proxy to socks5://127.0.0.1:5000 to access Google.
Test file:
https://dl.google.com/drive-file-stream/GoogleDriveSetup.exe

8.7.1 - test results for libcurl.dll:
Using curl_easy, the file downloads normally.
Using curl_multi split into 6 ranges, the problem above occurs.

I changed the download file to an address that downloads normally for me, still using socks5://127.0.0.1:5000:
Using curl_easy, it downloads fine.
Using curl_multi split into 6 ranges, it downloads fine.

The conclusion I've come to is that after one of curl_multi's sub-easy tasks finishes, calling curl_multi_remove_handle() or curl_easy_cleanup() destroys the proxy settings of the other sub-easy tasks that are still running. So the Google files that must be downloaded through the proxy keep retrying forever. These 6 easy handles carry exactly the same task information except for the different Range bytes, which may also be why, when one task ends, the proxies of the other tasks get corrupted. I used curl_easy_getinfo(CURLINFO_USED_PROXY) to inspect the tasks with broken proxies, and the returned value is still 1.
Files that can be downloaded normally are unaffected when the proxy is disabled (they even finish downloading faster).
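
For reference, a minimal C sketch of the CURLINFO_USED_PROXY check mentioned above (the easy handle name is hypothetical; the option was added in 8.7.0):

#include <stdio.h>
#include <curl/curl.h>

/* Sketch: after a transfer finishes, ask libcurl whether a proxy was
   actually used for it. "used" is set to 1 if it was. */
static void report_proxy_use(CURL *eh)
{
  long used = 0;
  if(curl_easy_getinfo(eh, CURLINFO_USED_PROXY, &used) == CURLE_OK)
    fprintf(stderr, "proxy used: %ld\n", used);
}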

Unless the developer has a similarly complicated network environment, this problem is hard to detect.
Of course, I'm not sure whether it is some new feature I'm missing, but I double-checked the changelog and couldn't find anything about it.

I expected the following

Consistent with the behavior of 8.6.

curl/libcurl version

8.7.0+

operating system

win-10 64bit

@bagder bagder changed the title A problem with 8.7.0+ SOCKS problem with 8.7.0+ Apr 24, 2024
@bagder bagder self-assigned this Apr 25, 2024
@bagder
Member

bagder commented Apr 25, 2024

The conclusion I've come to is that after one of curl_multi's sub-easy tasks finishes, calling curl_multi_remove_handle() or curl_easy_cleanup() destroys the proxy settings of the other sub-easy tasks that are still running

No, there is no such thing. That's a wild guess that cannot be correct. Also, I could reproduce this problem even with all the curl_multi_remove_handle() and curl_easy_cleanup() calls removed from the reproducer code.

In my testing to reproduce this issue, doing SOCKS5 seems shaky in 8.6.0 as well, compared to not using the proxy. Thus, I don't think 8.6.0 had this working correctly either; perhaps 8.7.x just made it more visible.

I have not yet figured out what the problem is. The symptom is timed-out transfers that otherwise would not time out.

@lazulkia
Author

I'm glad you reproduced the problem.

Some additional notes:
The "8.6.0-" version runs the same script for a long time without a similar problem. Or the chances of it happening are very low, or maybe it happened once but was assumed by me to be a problem with the VPS.
I tried to use https proxy and the problem didn't improve.

@bagder
Member

bagder commented Apr 25, 2024

I suspect I don't reproduce the problem you report. If you can see the same problem with an HTTPS proxy, then you are seeing something completely different. They use very different code paths.

Can you clarify exactly how the transfers fail? Do transfers 2-6 always fail once the first one has completed?

I figured out the reason why some of my proxy connections timed out: https://opensource.org does not work over IPv6. When connecting to it over a SOCKS proxy, curl resolves the host name and uses only a single connect attempt to it with IPv6, which fails. It fails also without any proxy.

But when connecting directly to a host, without a proxy, curl does happy eyeballs which tries both IPv6 and IPv4 and thus the failed IPv6 attempt is not visible.
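
As an aside, a minimal sketch of two standard libcurl options that would sidestep a single failing IPv6 connect like that, either by forcing IPv4 name resolution or by letting the SOCKS proxy resolve the name itself (the proxy address is the one from this report):

#include <curl/curl.h>

/* Sketch: two independent ways to avoid a lone failing IPv6 attempt
   when going through a SOCKS proxy. */
static void configure(CURL *eh)
{
  /* 1) restrict name resolution to IPv4 addresses */
  curl_easy_setopt(eh, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4);

  /* 2) or use socks5h:// so the proxy resolves the host name itself */
  curl_easy_setopt(eh, CURLOPT_PROXY, "socks5h://127.0.0.1:5000");
}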

@lazulkia
Author

lazulkia commented Apr 25, 2024

Test results: 10-20% of downloads complete normally.
But my test download was a small 1.3 MB file. It could also be a problem with my test method. I initially create only one task and add the other 5 after getting the "content-length". In other words, task-1 uses the request header "Range=bytes=0-", and I disconnect task-1 from the received-data callback based on the amount of data received. This means that task-1 is passively disconnected.

In the problematic tests, 80% of the time all 5 remaining tasks are caught in a loop. Other times it's 2 caught in a loop, sometimes 3...

https://dl.google.com/tag/s/appguid%3D%7B8A69D345-D564-463C-AFF1-A69D9E530F96%7D%26iid%3D%7B54CDD788-B7E5-D004-1752-C6F8496A667F%7D%26lang%3Den%26browser%3D3%26usagestats%3D1%26appname%3DGoogle%2520Chrome%26needsadmin%3Dprefers%26ap%3Dx64-stable-statsdef_1%26brand%3DCHZO%26installdataindex%3Dempty/update2/installers/ChromeSetup.exe

@lazulkia
Author

lazulkia commented Apr 25, 2024

I found a way to make the download work 100% of the time: even after task-1 has downloaded its allotted share, I do not disconnect it and let it download all the data. The other tasks then don't get stuck in a loop. That means the code that triggers the problem is this, in task-1:
write_callback_function() {
    if this.receive_size >= this.task_size
        return 0
}
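
In plain libcurl C terms, a sketch of that callback might look like this (receive_size/task_size are hypothetical per-transfer fields passed via CURLOPT_WRITEDATA); returning anything other than the number of bytes handed to the callback makes libcurl abort that one transfer with CURLE_WRITE_ERROR:

#include <curl/curl.h>

/* hypothetical per-transfer state, set up with CURLOPT_WRITEDATA */
struct task {
  size_t receive_size; /* bytes received so far */
  size_t task_size;    /* bytes this transfer is meant to fetch */
};

static size_t write_cb(char *data, size_t size, size_t nmemb, void *userp)
{
  struct task *t = (struct task *)userp;
  (void)data; /* payload ignored in this sketch */
  t->receive_size += size * nmemb;
  if(t->receive_size >= t->task_size)
    return 0; /* abort this transfer: reported as CURLE_WRITE_ERROR */
  return size * nmemb;
}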

@bagder
Member

bagder commented Apr 26, 2024

Even after task-1 has downloaded its allotted share, I do not disconnect it and let it download all the data.

Come again? What is it that you do or don't do with the first transfer?

@lazulkia
Author

lazulkia commented Apr 26, 2024

Following the 10-at-a-time.html example, if you download in 6 chunks, the main loop needs to create all 6 tasks at once after getting the "content-length" (assuming the file size is 6):

task-1 Range=bytes=0-1
this_1.task.size =1
...
task-6 Range=bytes=5-6
this_6.task.size =1

Note that here task-1 has a normal task size.

The way I did it, I skipped getting the "content-length" in the main loop and created only one task, with "Range=bytes=0-" set on task-1.
So the other 5 tasks are created from task-1's header callback:

task-1 headerfunction => get content-length
this_1.task.size = content-length / 6

loop 5
    create the 5 other tasks

This means that task-1 would download all the data (6), so I have to actively terminate it after it has downloaded its assigned share:

write_callback_function() {
    if this.receive_size >= this.task_size
        return 0
}

And I found that returning 0 from task-1's write_callback_function() causes the other 5 tasks to get stuck in a loop. So the way to make the download work 100% of the time is to not actively end task-1 and let it download all the data (6).
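
As a point of reference, a minimal sketch of how the five follow-up range transfers could be set up in plain libcurl once the content length is known, using CURLOPT_RANGE instead of a hand-built Range header (function and variable names here are illustrative, not from the script):

#include <stdio.h>
#include <curl/curl.h>

/* Sketch: add five ranged transfers to a multi handle once the total
   content length is known. */
static void add_range_transfers(CURLM *multi, const char *url, long total)
{
  long chunk = total / 6;
  int i;
  for(i = 1; i < 6; i++) {
    char range[64];
    CURL *eh = curl_easy_init();
    long start = i * chunk;
    long end = (i == 5) ? (total - 1) : (start + chunk - 1);
    snprintf(range, sizeof(range), "%ld-%ld", start, end);
    curl_easy_setopt(eh, CURLOPT_URL, url);
    curl_easy_setopt(eh, CURLOPT_RANGE, range); /* "start-end" in bytes */
    curl_easy_setopt(eh, CURLOPT_PROXY, "socks5://127.0.0.1:5000");
    curl_multi_add_handle(multi, eh);
  }
}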

@dfandrich
Contributor

10-at-a-time.c is a single-threaded example, so referring to its thread as "the main thread" is a bit incongruous since there is no other thread. It sounds like you've converted that example into a multithreaded program. Improper use of multithreading in libcurl is a frequent source of problems. Did you abide by everything in https://curl.se/libcurl/c/threadsafe.html when you converted the example?

@lazulkia
Author

lazulkia commented Apr 26, 2024

It was my poor choice of words; I'm not using multithreading.

You can interpret my improper use of "threads" here as:
main thread = the multi loop()
sub thread = task_callback()

@dfandrich
Contributor

dfandrich commented Apr 26, 2024 via email

@lazulkia
Author

lazulkia commented Apr 26, 2024

I don't understand which "limitations" you're talking about, but I don't call the curl API from write_callback_function() other than using RtlMoveMemory() to copy out the data.
There are only two possible return values:
return 0
return size * nmemb

Task-1's header_callback_function also doesn't call the curl API; it just creates 5 objects holding task information. Inside the multi loop{}, these objects are added to the multi handle, consistent with '10-at-a-time.c'.

@dfandrich
Contributor

dfandrich commented Apr 26, 2024 via email

@lazulkia
Author

Unless someone explicitly says they need the AutoHotkey code, I can't provide it.

@lazulkia
Author

I've summarized the current problem as follows:
In a download split into multiple tasks on one multi handle (all using the same proxy), if one task aborts its connection by returning 0 from write_callback(), the proxy for the other tasks fails.
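
Spelled out as a standalone sketch (plain libcurl C, not the AutoHotkey script itself; URL and proxy are the ones from this thread, range handling omitted for brevity), the scenario is: several transfers through the same SOCKS5 proxy, where the first one aborts from its write callback and the rest are expected to still complete:

#include <stdio.h>
#include <curl/curl.h>

#define NTRANSFERS 6

/* the first transfer aborts on its first write; the others just discard data */
static size_t write_abort(char *d, size_t n, size_t l, void *p)
{
  (void)d; (void)n; (void)l; (void)p;
  return 0; /* abort => CURLE_WRITE_ERROR for this transfer only */
}

static size_t write_discard(char *d, size_t n, size_t l, void *p)
{
  (void)d; (void)p;
  return n * l;
}

int main(void)
{
  CURLM *cm;
  int i, running, left = NTRANSFERS;

  curl_global_init(CURL_GLOBAL_ALL);
  cm = curl_multi_init();

  for(i = 0; i < NTRANSFERS; i++) {
    CURL *eh = curl_easy_init();
    curl_easy_setopt(eh, CURLOPT_URL,
                     "https://dl.google.com/drive-file-stream/GoogleDriveSetup.exe");
    curl_easy_setopt(eh, CURLOPT_PROXY, "socks5://127.0.0.1:5000");
    curl_easy_setopt(eh, CURLOPT_WRITEFUNCTION,
                     i == 0 ? write_abort : write_discard);
    curl_multi_add_handle(cm, eh);
  }

  do {
    CURLMsg *msg;
    int msgs_left;
    curl_multi_perform(cm, &running);
    while((msg = curl_multi_info_read(cm, &msgs_left))) {
      if(msg->msg == CURLMSG_DONE) {
        CURL *e = msg->easy_handle;
        fprintf(stderr, "R: %d - %s\n", (int)msg->data.result,
                curl_easy_strerror(msg->data.result));
        curl_multi_remove_handle(cm, e);
        curl_easy_cleanup(e);
        left--;
      }
    }
    if(left)
      curl_multi_wait(cm, NULL, 0, 1000, NULL);
  } while(left);

  curl_multi_cleanup(cm);
  curl_global_cleanup();
  return 0;
}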

@bagder
Member

bagder commented Apr 27, 2024

I think maybe the odd terms in use make it hard for us to speak the same language.

curl performs transfers. You can do multiple transfers with it. They are not multi-tasking.

By your description it sounds as if you add new transfers from within a callback, which sounds odd. libcurl tries to forbid that. So are you?

And again: you say repeatedly that "the other transfers fail", but I want to know more specifically and exactly how they fail. Details!

You say you do six transfers. Can you describe exactly what each transfer does?
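
One way to honor that restriction is sketched below: the callback only records that more transfers are wanted, and the main loop creates and adds the easy handles outside of any callback (names here are illustrative):

#include <curl/curl.h>

/* Sketch: the header callback only notes how many extra transfers are
   wanted; handles are created and added from the main loop. */
static int pending; /* set from the callback, consumed by the main loop */

static size_t header_cb(char *buf, size_t size, size_t nitems, void *userp)
{
  (void)buf; (void)userp;
  /* ... once content-length is seen, request 5 more ranged transfers ... */
  pending = 5;
  return size * nitems;
}

static void add_pending(CURLM *multi, const char *url)
{
  while(pending > 0) {
    CURL *eh = curl_easy_init();
    curl_easy_setopt(eh, CURLOPT_URL, url);
    curl_multi_add_handle(multi, eh); /* done outside any libcurl callback */
    pending--;
  }
}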

@lazulkia
Author

I searched all my statements and did not find the word "transfers".

If you look carefully, you will find it:
"
I'm not operating the curl api in write_callback_function()
Regarding task-1's header_callback_function, it also doesn't manipulate the curl api
"

@lazulkia
Author

lazulkia commented Apr 27, 2024

I don't know C at all, but I modified 10-at-a-time.c as a schematic. If you're still confused, close this issue; it's not worth the effort.



static size_t write_cb(char *data, size_t n, size_t l, void *userp)
{
	// this = current task
	/* take care of the data here, ignored in this example */
	(void)data;
	(void)userp;
	this.receive_size += n * l
	
	if this.receive_size >= this.task_size // Only task 1 requires this judgment
		return 0
		
	return n * l;
}

static task_list=[]

static size_t write_head(char *data, size_t n, size_t l, void *userp)
{
	// this = task.1
	if str=="\r\n" 
	&&  if head has content-length
	&& if head has content-range
	{
		this.task_size := content-length/6
		offset = content-length/6
		loop 5 //Create 5 objects containing task information
		{
			task_list.push({url: this.url, Range:"Range:bytes=%offset%-%offset+(content-length/6)%"})
		}
		
	}
	
	/* take care of the data here, ignored in this example */
	(void)data;
	(void)userp;
	return n * l;
}





int main(void)
{
	CURLM *cm;
	CURLMsg *msg;
	unsigned int transfers = 0;
	int msgs_left = -1;
	int left = 0;

	curl_global_init(CURL_GLOBAL_ALL);
	cm = curl_multi_init();

	/* Limit the amount of simultaneous connections curl should allow: */
	curl_multi_setopt(cm, CURLMOPT_MAXCONNECTS, (long)MAX_PARALLEL);

// add  task.1
	CURL *eh = curl_easy_init();
	curl_easy_setopt(eh, CURLOPT_WRITEFUNCTION, write_cb);
	curl_easy_setopt(eh, CURLOPT_HEADERFUNCTION, write_head);
	curl_easy_setopt(eh, CURLOPT_URL, "https://dl.google.com/drive-file-stream/GoogleDriveSetup.exe ");
	curl_slist_append(xx,"Range=bytes=0-")
	
	
	curl_multi_add_handle(cm, eh);
	(*left)++;


	do
	{
		int still_alive = 1;
		curl_multi_perform(cm, &still_alive);

		while ((msg = curl_multi_info_read(cm, &msgs_left)))
		{
			if (msg->msg == CURLMSG_DONE)
			{
				char *url;
				CURL *e = msg->easy_handle;
				curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE, &url);
				fprintf(stderr, "R: %d - %s <%s>\n",
					   msg->data.result, curl_easy_strerror(msg->data.result), url);
				curl_multi_remove_handle(cm, e);
				curl_easy_cleanup(e);
				left--;
			}
			else
			{
				fprintf(stderr, "E: CURLMsg (%d)\n", msg->msg);
			}
			
			if task_list
			for x in task_list
			{
			CURL *eh = curl_easy_init();
			curl_easy_setopt(eh, CURLOPT_WRITEFUNCTION, write_cb);

			curl_easy_setopt(eh, CURLOPT_URL, x.url);
			curl_slist_append(xx, x.Range);

			curl_multi_add_handle(cm, eh);
			left++;
				
			}

		}
		if (left)
			curl_multi_wait(cm, NULL, 0, 1000, NULL);

	} while (left);

	curl_multi_cleanup(cm);
	curl_global_cleanup();

	return EXIT_SUCCESS;
}


@bagder
Member

bagder commented Apr 27, 2024

I searched all my statements and did not find the word "transfers".

That's exactly the problem: you refer to them as tasks but they are not tasks. They are transfers.

Your code refers to a "this" as the "current task" - but the same callback function is used for all transfers, so what is "the current task"?

@lazulkia
Author

lazulkia commented Apr 27, 2024

AutoHotkey can attach data when creating callbacks. The "this" variable is just there to record the task_size and nothing else, like in this new example:



static size_t write_cb(this, char *data, size_t n, size_t l, void *userp)
{
	/* take care of the data here, ignored in this example */
	(void)data;
	(void)userp;
	this.receive_size += n * l
	
	if this.receive_size >= this.task_size // Only task 1 requires this judgment
		return 0
		
	return n * l;
}

static task_list=[]

static size_t write_head(this, char *data, size_t n, size_t l, void *userp)
{
	if str=="\r\n" 
	&&  if head has content-length
	&& if head has content-range
	{
		this.task_size := content-length/6
		offset = content-length/6
		loop 5 //Create 5 objects containing task information
		{
			task_list.push({url: this.url, task_size: content-length/6, Range:"Range:bytes=%offset%-%offset+(content-length/6)%"})
		}
		
	}
	
	/* take care of the data here, ignored in this example */
	(void)data;
	(void)userp;
	return n * l;
}





int main(void)
{
	CURLM *cm;
	CURLMsg *msg;
	unsigned int transfers = 0;
	int msgs_left = -1;
	int left = 0;

	curl_global_init(CURL_GLOBAL_ALL);
	cm = curl_multi_init();

	/* Limit the amount of simultaneous connections curl should allow: */
	curl_multi_setopt(cm, CURLMOPT_MAXCONNECTS, (long)MAX_PARALLEL);

// add  task.1
	CURL *eh = curl_easy_init();
	t_1  = {eh: eh, task_size:0, url: "https://dl.google.com/drive-file-stream/GoogleDriveSetup.exe"}
	curl_easy_setopt(eh, CURLOPT_WRITEFUNCTION, write_cb.bind(t_1));
	curl_easy_setopt(eh, CURLOPT_HEADERFUNCTION, write_head.bind(t_1));
	curl_easy_setopt(eh, CURLOPT_URL, "https://dl.google.com/drive-file-stream/GoogleDriveSetup.exe");
	curl_slist_append(xx,"Range=bytes=0-")
	
	
	curl_multi_add_handle(cm, eh);
	(*left)++;


	do
	{
		int still_alive = 1;
		curl_multi_perform(cm, &still_alive);

		while ((msg = curl_multi_info_read(cm, &msgs_left)))
		{
			if (msg->msg == CURLMSG_DONE)
			{
				char *url;
				CURL *e = msg->easy_handle;
				curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE, &url);
				fprintf(stderr, "R: %d - %s <%s>\n",
					   msg->data.result, curl_easy_strerror(msg->data.result), url);
				curl_multi_remove_handle(cm, e);
				curl_easy_cleanup(e);
				left--;
			}
			else
			{
				fprintf(stderr, "E: CURLMsg (%d)\n", msg->msg);
			}
			
			if task_list
			for x in task_list
			{
			CURL *eh = curl_easy_init();
			curl_easy_setopt(eh, CURLOPT_WRITEFUNCTION, write_cb.bind(x));

			curl_easy_setopt(eh, CURLOPT_URL, x.url);
			curl_slist_append(xx, x.Range);

			curl_multi_add_handle(cm, eh);
			left++;
				
			}

		}
		if (left)
			curl_multi_wait(cm, NULL, 0, 1000, NULL);

	} while (left);

	curl_multi_cleanup(cm);
	curl_global_cleanup();

	return EXIT_SUCCESS;
}



@bagder
Member

bagder commented Apr 27, 2024

I'm sorry but none of this brings me any closer to understanding the issue. I don't understand your program. I don't even know how the transfers fail. I can't reproduce the problem.

@lazulkia
Author

Ok, thanks for the reply. I'm sure you didn't forget to use the SOCKS5 proxy in your test, and that your test triggered the "return 0" path. I don't have any more information to provide, so it looks like I'm raising a meaningless issue; please ignore the problem.
