15:20:53 GMT is there such a thing as an "acceptable amount of running redis instances/servers" on a machine?
15:21:19 GMT or just go crazy with it
15:30:13 GMT parazyd, RAM and CPU are your real limits
15:30:14 GMT parazyd: the optimum is probably one per core
15:30:18 GMT likely
15:30:24 GMT depending on what kind of load they get
15:30:30 GMT and what your reason for having multiple instances is
15:30:46 GMT the only time i did multiple instances they all had the same data
15:30:51 GMT so i could throw more than one cpu core at it
15:31:32 GMT easier usage/navigation
15:31:54 GMT ram isn't a limit theoretically
15:32:13 GMT and cpu?
15:32:19 GMT one instance is certainly easier to manage than a cluster
15:32:28 GMT dunno, probably 4 cores
15:32:51 GMT each one would keep specific data, in ~127 databases each
15:33:10 GMT so depending on the specified query, i would call a specific server
15:33:52 GMT i don't know how else i would navigate(?) if everything was on the same server, with this there is kind of a pattern
15:34:19 GMT that's a lot of databases
15:34:44 GMT <*> parazyd shrugs
15:35:09 GMT ¯\_(ツ)_/¯
15:35:09 GMT it's not much load from my current understanding
15:35:15 GMT so it's doable
15:35:43 GMT and in the 127 databases per server there is a pattern, where i could use a python list or whatever to fill it up
15:40:23 GMT but why not one instance?
15:41:11 GMT I don't know how to make navigation easy
15:41:23 GMT let me show you what i'm talking about
15:41:37 GMT please do
15:42:17 GMT i'll paste a sprunge text, easier than spamming here. give me a sec
15:46:01 GMT b
15:46:07 GMT http://sprunge.us/IhDb
15:46:10 GMT here's a blurb
15:48:16 GMT why do you need to put each thing in a repo in its own DB?
15:48:53 GMT because the key is the package name, and the value is another hashmap which i can use when i get it from redis
15:49:27 GMT example file: http://packages.devuan.org/merged/dists/jessie-backports/contrib/binary-all/Packages
15:50:57 GMT what exactly are you storing, and what's the purpose of storing/how do you access it? (search for it, get the hash by package name, etc)
15:52:22 GMT each paragraph (^\n separation) in the Packages file is an entry. so the key is the package name (first line), and the value is the whole package info converted to a hashmap
15:52:28 GMT i use it through python
15:52:59 GMT the goal is to have overlays of these files. so i have one iteration of the Packages file, then overlay another one, and another one... etc
15:53:17 GMT when done, i generate a new 'Package' file with the result
15:53:28 GMT ah
15:53:31 GMT maybe redis is the wrong tool here
15:53:40 GMT yeah, you can do that in python
15:53:56 GMT so what do you store in the different databases?
15:54:08 GMT i'm still confused about that
15:54:28 GMT i was thinking a Packages file per db
15:55:10 GMT ah
15:55:23 GMT you can just encode that info into the key
15:55:36 GMT what do you mean?
15:55:48 GMT e.g.: jessie-backports:contrib:binary-all:
15:56:05 GMT !
15:56:08 GMT never thought of that
15:56:10 GMT that's for storing only though, it's not gonna help you with overlaying
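A minimal sketch of the "encode it into the key" idea, assuming redis-py (3.5 or newer for hset's mapping argument); the key layout, file name, and function names here are only illustrative, not something agreed in the channel:

```python
# Sketch: parse a Debian Packages file and store each stanza as one Redis hash,
# with the repo coordinates encoded into the key instead of using separate DBs,
# e.g. "jessie-backports:contrib:binary-all:<package>".
import redis

def parse_packages(text):
    """Yield one {field: value} dict per blank-line-separated stanza."""
    for stanza in text.strip().split("\n\n"):
        fields = {}
        last_key = None
        for line in stanza.splitlines():
            if line.startswith((" ", "\t")) and last_key:
                # continuation line, e.g. a long Description
                fields[last_key] += "\n" + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                last_key = key.strip()
                fields[last_key] = value.strip()
        if "Package" in fields:
            yield fields

def store(r, prefix, text):
    for pkg in parse_packages(text):
        # one hash per package; repo info lives in the key, not the DB number
        r.hset("%s:%s" % (prefix, pkg["Package"]), mapping=pkg)

if __name__ == "__main__":
    r = redis.Redis(decode_responses=True)
    with open("Packages") as f:
        store(r, "jessie-backports:contrib:binary-all", f.read())
```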
15:56:40 GMT i can always have a temporary db where i do the overlaying
15:56:57 GMT with overlaying you mean if you take 2 repos you first add all packages from repo 1, then all from repo 2 and if one already exists it'll get overridden?
15:57:15 GMT no, the opposite
15:57:32 GMT top priority gets overridden only if it's from the same priority
15:57:35 GMT if lower, then it's dropped
15:57:45 GMT sounds hard to do in redis
15:57:59 GMT if possible at all
15:58:48 GMT yes indeed
15:59:12 GMT but i could use it for storing though
15:59:14 GMT where do you get the priority from?
15:59:20 GMT and then do the overlaying in python
15:59:34 GMT you could, but it's probably got no benefit over loading it from the file
16:00:05 GMT depending on how the thing you're making works, it might just be cheaper to keep the data in python in memory
16:01:06 GMT yeah i don't know how expensive the parsing is
16:01:08 GMT probably not much
16:01:17 GMT i think the biggest file is around 40M
16:01:52 GMT all in all, probably the first run is the toughest, which is about 4GB of data
16:02:10 GMT afterwards, i just do diffs and update what is needed
16:02:23 GMT how often do you run that?
16:02:28 GMT (that's another thing that triggers this application)
16:02:37 GMT whenever a Package file is updated
16:02:51 GMT mh
16:02:59 GMT but it's the first run that's expensive, as i said
16:03:04 GMT so you DL the new Package and re-parse it?
16:03:09 GMT yes
16:03:20 GMT but only the one(s) that changed
16:03:59 GMT which i can easily do in shell/python as well, by looking at http headers
16:04:05 GMT so you'd run it as a script every time a Package is updated and produce the overlay of one (or hundreds) of specific configurations?
16:04:47 GMT yes
16:05:10 GMT one or hundreds?
16:05:36 GMT ive never used redis. ive dabbled on couchdb but i have redis installed on my linux box. Can i grab data from an ldap server and place it within redis? I only want communication to ldap when ldap changes. is it possible to script this and have redis cache all information? This will be for a php web application
16:05:42 GMT minus: what are hundreds in your context? files, or what's inside the files?
16:05:54 GMT parazyd: overlay configurations
16:06:29 GMT though that doesn't really matter much thinking about it
16:06:32 GMT i got lost... :D
16:06:53 GMT well, the configuration of which overlays are used
16:07:03 GMT and with which priorities
16:07:10 GMT no that's a hashmap in python
16:07:21 GMT 3 or 4 overlays
16:07:58 GMT 0 is top, it's smallest, and 3 is lowest, it's biggest
16:08:28 GMT anyway yes, perhaps it's good to avoid redis in this usecase
16:10:15 GMT if you want to avoid reparsing on every run, either load everything into memory and keep it running (pretty much equivalent to redis) or dump the parsed data to a pickle
16:10:41 GMT yes it should be a daemon
16:10:44 GMT but the first thing i'd try is to just parse everything every time and go with that if it takes just a few seconds
16:10:59 GMT this is why i initially thought of redis
16:11:02 GMT should it take care of downloading those files too?
16:11:17 GMT if yes, then it's a daemon
16:11:26 GMT if no then a cronjob can run a shell script
16:11:42 GMT then inotify or the shell script can run this
16:12:18 GMT i think i kill 2 birds with one stone if i do it as a daemon, and download with python
16:12:25 GMT daemon sounds slightly better because no moving parts
16:13:45 GMT yeah, and threads work well :)
16:13:59 GMT yeah, threads should do
16:14:05 GMT alternative: asyncio
16:14:41 GMT ack :)
16:19:36 GMT minus: thanks for the tips, appreciate it
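For the overlay step itself (which, as agreed above, is better done outside redis), a rough plain-Python sketch of the priority rules as described: priority 0 is the top overlay, an existing entry is only overridden by one of the same priority, and lower-priority entries are dropped. Function and variable names are made up for illustration; the package dicts would come from a parser like the one sketched earlier.

```python
def merge_overlays(overlays):
    """overlays: list of (priority, {package_name: fields}); lower number = higher priority."""
    merged = {}   # package name -> fields of the winning entry
    origin = {}   # package name -> priority that entry came from
    for priority, packages in sorted(overlays, key=lambda o: o[0]):
        for name, fields in packages.items():
            if name not in merged or origin[name] == priority:
                merged[name] = fields        # new package, or same-priority override
                origin[name] = priority
            # else: an entry from a higher-priority overlay exists, drop this one
    return merged

def write_packages(path, merged):
    # regenerate a Packages-style file from the merged result
    # (multi-line fields would need their continuation indentation re-added)
    with open(path, "w") as f:
        for fields in merged.values():
            for key, value in fields.items():
                f.write("%s: %s\n" % (key, value))
            f.write("\n")

# tiny usage example with hand-written stanzas
overlays = [
    (0, {"foo": {"Package": "foo", "Version": "2.0"}}),
    (3, {"foo": {"Package": "foo", "Version": "1.0"},
         "bar": {"Package": "bar", "Version": "0.1"}}),
]
result = merge_overlays(overlays)   # foo stays at version 2.0, bar gets added
```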
17:21:38 GMT Heya
17:23:46 GMT I'm looking at storing the number of requests per day, per ip. Any advice whether it's preferable to a) store a single hash per day, where the hash keys are the ip addresses and the hash values are the number of hits from that ip; or b) store a different key for every ip address per date?
17:24:33 GMT ie a single "visits:<date>" hash with lots of values, or multiple "visits:<date>:<ip>" keys with an integer value
17:25:07 GMT I don't need to cross-reference any of these values, get all of them in one go, or anything like that
17:36:27 GMT I'm mostly concerned about memory usage, but it would be nice to know about the performance characteristics of a hash vs lots of keys too
17:39:29 GMT unless you're already using redis for other things: use a time series DB
17:44:37 GMT I'm already using redis for other stuff, and don't have a time series db handy
17:48:31 GMT storing stuff in a hash seems to save a bunch of memory
17:48:41 GMT you can just do a quick benchmark anyway
17:50:39 GMT So... I've got a simple queuing system set up and, with it, a simple messaging system for my workers to give some additional notifications via pubsub. Only I have this weird problem. Any listeners (in python2) I have on Windows machines seem to eventually stop seeing notifications from the channel they're subscribed to. Wouldn't mind a hint on how to sort that out...
17:51:47 GMT I'm suspecting a socket buffer overflow on the listeners, but the messages being sent are so small and so infrequent, I'm not sure how that might happen
17:52:48 GMT jdelStro1her: in the end it depends on how you need to access the data
17:53:36 GMT Fweeb: i'd check the connection status in something like process hacker
17:54:07 GMT or take a look at the connection with wireshark
17:55:17 GMT minus: been monitoring with wireshark. A little tough to find correlations there. Nothing jumps out as being out of the ordinary. But then again, I don't frequently use wireshark
17:55:56 GMT well, it should be obvious to see what happens on the connection; like windows replying with ICMP errors
17:56:57 GMT Just some resets, AFAICT
18:00:38 GMT "just"
18:00:45 GMT RST means bad
18:02:14 GMT In that case... I have a better idea of where to look now. :)
18:05:36 GMT is the connection idle for a longer while?
18:05:41 GMT at times
18:21:38 GMT minus: It can be
18:23:01 GMT is there some kind of NAT or so between client and server?
18:24:04 GMT The confusing thing is that linux clients don't have this issue
18:40:50 GMT maybe different default socket timeouts
18:40:57 GMT no idea if sockets have one by default
19:53:20 GMT turns out... it wasn't related to socket buffers, but to the stdout buffer (the listener was launched via subprocess)
19:53:59 GMT i.e. stdout=PIPE and that was never read from, filled up and locked the process?
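On that last point, a small sketch of two common ways to keep a subprocess-launched listener from blocking on a full stdout pipe, assuming the launcher is Python; "listener.py" is a placeholder for the actual subscriber script:

```python
import os
import subprocess
import threading

# Option 1: if the listener's output isn't needed, don't give it a pipe at all,
# so nothing can fill up (open(os.devnull) works on both Python 2 and 3).
devnull = open(os.devnull, "wb")
proc = subprocess.Popen(["python", "listener.py"],
                        stdout=devnull, stderr=subprocess.STDOUT)

# Option 2: keep the pipe but drain it continuously so the OS buffer never
# fills and stalls the child.
def drain(pipe):
    for line in iter(pipe.readline, b""):
        pass  # or log the line somewhere

proc2 = subprocess.Popen(["python", "listener.py"],
                         stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
t = threading.Thread(target=drain, args=(proc2.stdout,))
t.daemon = True
t.start()
```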
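And back to the earlier requests-per-day, per-IP question: a sketch of variant a), one hash per day updated with HINCRBY, assuming redis-py; the "visits:<date>" key layout is just the naming floated above.

```python
import datetime
import redis

r = redis.Redis(decode_responses=True)

def record_hit(ip):
    key = "visits:%s" % datetime.date.today().isoformat()  # "visits:" + ISO date
    r.hincrby(key, ip, 1)            # creates the hash and the field as needed
    r.expire(key, 90 * 24 * 3600)    # optional: let old days age out

def hits_for(day):
    return r.hgetall("visits:%s" % day)  # {ip: count} for that day
```

Variant b) would simply be r.incr("visits:%s:%s" % (day, ip)); while a day's hash stays reasonably small, Redis keeps it in a compact internal encoding, which matches the observation above that the hash version saves a fair amount of memory.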