• 0 Posts
Joined 1 year ago
Cake day: June 11th, 2023

  • if you technically pull people out of poverty by outsourcing to the lowest paying, least labor regulated parts of the world, is the fact that extreme poverty went away in those areas even a good thing?

    Yes. Your prospects of a healthy life increase when going from not being able to provide for yourself to being barely able to provide for yourself by working in fantastically poor conditions.

    If a sweatshop didn’t provide more worker value than extreme poverty, people just wouldn’t work there.

    The bare minimum of improvements is still an improvement, and that we should strive for better than the bare minimum doesn’t make the bare minimum worthless to the people who got it.

  • Oh, certainly. But common language has a term for high latency already, it’s just not speed related. Everyone knows about a laggy connection on a phone or video call.

    Fun fact: TCP has some implicit design assumptions about the maximum cost of packet retransmission on a viable link, and they only hold at roughly planetary scale.
    When NASA started to get out to Mars with the space Internet, they needed to tweak TCP to account for retransmission being proportionally much more expensive, and to let connections live longer before being considered “broken”.
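
    As a rough back-of-the-envelope sketch of why the timers stop making sense: the snippet below compares the light-speed round trip to Mars with how long a sender waits before giving up when its timeout doubles after each unanswered attempt. The distances are approximate, and the 1 second initial timeout and 6 retries are illustrative assumptions, not a claim about any particular TCP stack.

```python
# Light-time round trip between Earth and Mars versus a TCP-style
# give-up budget. All constants below are illustrative assumptions.

SPEED_OF_LIGHT_KM_S = 299_792.458
MARS_NEAREST_KM = 54.6e6   # approximate closest approach
MARS_FARTHEST_KM = 401e6   # approximate greatest distance


def round_trip_seconds(distance_km: float) -> float:
    """Minimum round-trip time set by the speed of light alone."""
    return 2 * distance_km / SPEED_OF_LIGHT_KM_S


def give_up_after_seconds(initial_timeout_s: float, retries: int) -> float:
    """Total wait when the timeout doubles after every unanswered attempt."""
    return sum(initial_timeout_s * 2**attempt for attempt in range(retries + 1))


budget = give_up_after_seconds(initial_timeout_s=1.0, retries=6)
nearest_rtt = round_trip_seconds(MARS_NEAREST_KM)
farthest_rtt = round_trip_seconds(MARS_FARTHEST_KM)

print(f"sender gives up after ~{budget:.0f} s")
print(f"Mars round trip: {nearest_rtt / 60:.1f} to {farthest_rtt / 60:.1f} minutes")
```

    Under those assumptions the sender would declare the connection dead before a single reply could physically arrive, even at closest approach.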

  • Yes, to a degree. A VPN protects you from an attacker on the same WiFi network as you and that’s about it.

    Most assaults on your privacy don’t happen like that, and for the most part the attacks that do happen like that are stopped by the website using https and proper modern security.
    The benefit of the VPN is that it puts some of that protection under your control, but only as far as your VPN provider.

    A VPN is about as much protection from most cyber attacks as a gun is.

    They’re not a security tool, they’re a networking tool. They let you do some network stuff securely, and done correctly they can protect from some things, but the point of them is “this looks like a small, simple LAN, but it’s not”.

    It’s much easier to package and sell network tools than security tools, and they’re much more accepted by users, since security tools have a tendency to say “no” a lot, particularly when you might be doing something dumb, and users hate being told no, particularly when they’re doing something dumb.

  • Yeah, and it’s not like you want the information out there, it’s just that in my opinion it’s not something I would pay money for. Having the authority to make the request doesn’t mean that the party on the other end is obligated to comply, or in some cases even legally permitted to.

    I’ve used Google’s service where they send you an email to review results if they find something, and my Google results for my incredibly distinctive name are basically only professional resources that I kinda want to be findable.

  • Honestly? It’s not something I would pay for. Google has their own service where they’ll let you know if they find your information and you can ask them to remove the search result.
    Beyond that, there’s some information that you just fundamentally can’t make private and that no service can get taken down.
    Most data mining sites just collect those public records and put them next to each other, so they get a pile of your name, birthday, where you were born, how active you are as a voter and all that stuff.

    Removing your address from Google Maps just seems silly to me. That there is a residence there is fundamentally public information. Not being on maps doesn’t make it less public, it just probably causes issues for delivery drivers.

    Anyone who has your data and is going to be a jerk about it isn’t going to listen to a request to take it down either. They’re just going to send you spam messages.

    The odds of being targeted by a determined individual who’s focused explicitly on you are low. Attackers tend to target a broad swath of people, and then dig in on people who take the bait a few times.

  • I have never felt so old.

    Name, address, and phone number of the account holder used to be published in books that got sent to everyone in the city and also just left lying in boxes that had phones in them if you needed to make a call while you weren’t home, because your phone used to be tied to a physical location.
    You also used to have to pay extra to make calls to places far away because it used more phone circuits. And by “far away” I mean roughly 50 miles.

    It’s not the biggest thing in the world, privacy wise, since a surprising amount of information is considered public.
    If you know an address, it’s pretty much trivial to find the owner’s name, basic layout of the house, home value, previous owners, utility bill information, tax payments, and so on. I looked up my information and was able to pretty easily get the records for my house, showing I pay my bills on time, when I got my air conditioner replaced and who the contractor who did it was.

    As an example, here’s the property record for a parking structure owned by the state of Michigan. I chose a public building accessible by anyone and owned by a government to avoid randomly doxing someone, but it’s really as easy as searching for public records for some county or city and you’ll find something pretty fast.

  • Depends on the vendor for the specifics. In general, they don’t protect against an attacker who has gained persistent privileged access to the machine, only against theft.
    Since the key either can’t leave the TPM or is useless without it, the attacker needs to remain undetected on the server for as long as they want to use it, which is difficult for anyone less sophisticated than an advanced persistent threat. (Some TPMs hold one internal key that they will never return; they generate each new key and hand it back encrypted with that internal key, so you get the protection without having to worry about storage on the chip.)
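
    A toy model of that “useless without it” property, as I understand it: the chip keeps one internal key, hands back new keys only in wrapped form, and does the unwrapping and signing internally. Everything here is hypothetical (`ToyTpm`, its methods, the hash-based XOR “cipher”, and HMAC standing in for a real signature), so treat it as a sketch of the idea and not of any vendor’s actual interface.

```python
import hashlib
import hmac
import secrets


class ToyTpm:
    """Toy model of a chip whose internal key never leaves it."""

    def __init__(self) -> None:
        # Never returned by any method on this class.
        self._internal_key = secrets.token_bytes(32)

    def _keystream(self, nonce: bytes) -> bytes:
        # Toy cipher: hash of internal key plus nonce. Not real encryption.
        return hashlib.sha256(self._internal_key + nonce).digest()

    def create_wrapped_key(self) -> bytes:
        """Generate a signing key and return it only in wrapped form."""
        signing_key = secrets.token_bytes(32)
        nonce = secrets.token_bytes(16)
        wrapped = bytes(a ^ b for a, b in zip(signing_key, self._keystream(nonce)))
        return nonce + wrapped

    def sign(self, wrapped_key: bytes, message: bytes) -> bytes:
        """Unwrap inside the chip, sign, and return only the signature."""
        nonce, wrapped = wrapped_key[:16], wrapped_key[16:]
        signing_key = bytes(a ^ b for a, b in zip(wrapped, self._keystream(nonce)))
        return hmac.new(signing_key, message, hashlib.sha256).digest()


server_tpm = ToyTpm()
blob = server_tpm.create_wrapped_key()  # fine to keep on ordinary disk
signature = server_tpm.sign(blob, b"package-1.0.tar")

# A thief who copies the blob to another machine has a different chip,
# so the key it unwraps (and therefore the signature) does not match.
thief_tpm = ToyTpm()
stolen_signature = thief_tpm.sign(blob, b"package-1.0.tar")
print(hmac.compare_digest(signature, stolen_signature))
```

    The wrapped blob can sit on disk or in a backup, but it only turns back into a usable key on the one chip that made it.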

    The Apple system, to its credit, does a degree of user and application validation to use the keys. Generally good for security, but it makes it so if you want to share a key between users you probably won’t be using the secure enclave.

    Most of the trust checks end up being the TPM proving itself to the remote service that’s checking the service. For example, when you use your phone’s biometrics to log into a website, part of that handshake is the TPM on the phone proving that it’s made by a company to a spec validated by the standards to be secure in the way it’s claiming.

  • Package signing is used to make sure you only get packages from sources you trust.
    Every Linux distro does it and it’s why if you add a new source for packages you get asked to accept a key signature.

    For a long time, the keys used for signing were just files on disk, and you protected them by protecting the server they were on, but they were technically able to be stolen and used to sign malicious packages.
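
    A minimal sketch of that check, and of why a stolen key is the real problem. Real package signing uses an asymmetric key pair so clients only hold the public half; to keep this runnable with only the standard library, HMAC is used here as a stand-in, and the key, package contents, and function names are all made up for illustration.

```python
import hashlib
import hmac

# Stand-in for the distro's signing key (hypothetical value).
TRUSTED_KEY = b"hypothetical-demo-key"


def sign_package(package: bytes, key: bytes) -> bytes:
    """Sign the SHA-256 digest of the package contents."""
    digest = hashlib.sha256(package).digest()
    return hmac.new(key, digest, hashlib.sha256).digest()


def verify_package(package: bytes, signature: bytes, key: bytes) -> bool:
    """Accept the package only if the signature matches the trusted key."""
    expected = sign_package(package, key)
    return hmac.compare_digest(expected, signature)


package = b"contents of some-package-1.0"
malware = b"contents of some-package-1.0 plus malware"
signature = sign_package(package, TRUSTED_KEY)

# Untampered package from the trusted source.
print(verify_package(package, signature, TRUSTED_KEY))  # True
# Tampered contents with the old signature.
print(verify_package(malware, signature, TRUSTED_KEY))  # False
# Malware signed with the attacker's own key.
print(verify_package(malware, sign_package(malware, b"attacker-key"), TRUSTED_KEY))  # False
# Malware signed with a stolen copy of the trusted key.
print(verify_package(malware, sign_package(malware, TRUSTED_KEY), TRUSTED_KEY))  # True
```

    The last line is the whole threat: once the key file is copied, the check can’t tell the attacker from the real packager.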

    Some advances in chip design and cost reductions later, we now have what is often called a “secure enclave”, “trusted platform module”, or a general provider for a non-exportable key.
    It’s a little chip that holds or manages a cryptographic key such that it’s impossible (or exceptionally difficult) to extract the signing key from the chip. That makes it nearly impossible to steal the key without physically stealing the server, which is much easier to prevent by putting it in a room with doors and is impossible to do without detection, making a forged package vastly less likely.

    There are services that exist that provide the infrastructure needed to do this, but they cost money and it takes time and money to build it into your system in a way that’s reliable and doesn’t lock you to a vendor if you ever need to switch for whatever reason.

    So I believe this is Valve picking up the bill to move Arch’s package infrastructure security up to the top tier.
    It was fine before, but that upgrade is expensive for a volunteer and donation based project and cheap for a high profile company that might legitimately be worried about their use of Arch on physical hardware increasing the threat interest.

  • Eeeh, I still think diving into the weeds of the technical is the wrong way to approach it. Their argument is that training isn’t copyright violation, not that sufficient training dilutes the violation.

    Even if trained only on one source, it’s quite unlikely that it would generate copyright infringing output. It would be vastly less intelligible, likely to the point of overtly garbled words and sentences lacking much in the way of grammar.

    Whether what they’re doing is technically an infringement, or how it works, is entirely aside from a discussion of whether it should be infringement or permitted.

  • Basing your argument around how the model or training system works doesn’t seem like the best way to frame your point to me. It invites a lot of mucking about in the details of how the systems do or don’t work, how humans learn, and what “learning” and “knowledge” actually are.

    I’m a human as far as I know, and it’s trivial for me to regurgitate my training data. I regularly say things that are either directly references to things I’ve heard, or accidentally copy them, sometimes with errors.
    Would you argue that I’m just a statistical collage of the things I’ve experienced, seen or read? My brain has as many copies of my training data in it as the AI model, namely zero, but “Captain Picard of the USS Enterprise sat down for a rousing game of chess with his friend Sherlock Holmes, and then Shakespeare came in dressed like Mickey Mouse and said ‘to be or not to be, that is the question, for tis nobler in the heart’ or something”. Direct copies of someone else’s work, as well as multiple copyright infringements.
    I’m also shit at drawing with perspective. It comes across like a drunk toddler trying their hand at cubism.

    Arguing about how the model works or the deficiencies of it to justify treating it differently just invites fixing those issues and repeating the same conversation later. What if we make one that does work how humans do in your opinion? Or it properly actually extracts the information in a way that isn’t just statistically inferred patterns, whatever the distinction there is? Does that suddenly make it different?

    You don’t need to get bogged down in the muck of the technical to say that even if you concede every technical point, we can still say that a non-sentient machine learning system can be held to different standards with regard to copyright law than a sentient person. A person gets to buy a book, read it, and then carry around that information in their head and use it however they want. Not-A-Person does not get to read a book and hold that information without consent of the author.
    Arguing why it’s bad for society for machines to mechanise the production of works inspired by others is more to the point.

    Computers think the same way boats swim. Arguing about the difference between hands and propellers misses the point that you don’t want a shrimp boat in your swimming pool. I don’t care why they’re different, or that it technically did or didn’t violate the “free swim” policy, I care that it ruins the whole thing for the people it exists for in the first place.

    I think all the AI stuff is cool, fun and interesting. I also think that letting it train on everything regardless of the creators’ wishes has too much opportunity to make everything garbage. Same for letting it produce content that isn’t labeled or cited.
    If they can find a way to do and use the cool stuff without making things worse, they should focus on that.

  • As written the headline is pretty bad, but it seems their argument is that they should be able to train from publicly available copyrighted information, like blog posts and social media, and not from private copyrighted information like movies or books.

    You can certainly argue that “downloading public copyrighted information for the purposes of model training” should be treated differently from “downloading public copyrighted information for the intended use of the copyright holder”, but it feels disingenuous to put this comment itself, to which someone has a copyright, into the same category as something not shared publicly like a paid article or a book.

    Personally, I think it’s a lot like search engines. If you make something public, someone can analyze it, link to it, or take other derivative actions, but they can’t copy it and share the copy with others.