author | Samuel Thibault <samuel.thibault@ens-lyon.org> | 2009-12-16 01:11:51 +0100 |
---|---|---|
committer | Samuel Thibault <samuel.thibault@ens-lyon.org> | 2009-12-16 01:11:51 +0100 |
commit | 55dbc2b5d857d35262ad3116803dfb31b733d031 (patch) | |
tree | 9b9a7d7952ff74855de2a6c348e9c856a531158d /xen/net.c | |
parent | 870925205c78415dc4c594bfae9de8eb31745b81 (diff) | |
Add Xen support
2009-03-11 Samuel Thibault <samuel.thibault@ens-lyon.org>
* i386/i386/vm_param.h (VM_MIN_KERNEL_ADDRESS) [MACH_XEN]: Set to
0x20000000.
* i386/i386/i386asm.sym (pfn_list) [VM_MIN_KERNEL_ADDRESS ==
LINEAR_MIN_KERNEL_ADDRESS]: Define to constant PFN_LIST.
2009-02-27 Samuel Thibault <samuel.thibault@ens-lyon.org>
* i386/intel/pmap.c [MACH_HYP] (INVALIDATE_TLB): Call hyp_invlpg
instead of flush_tlb when e - s is compile-time known to be
PAGE_SIZE.
2008-11-27 Samuel Thibault <samuel.thibault@ens-lyon.org>
* i386/configfrag.ac (enable_pae): Enable by default on the Xen
platform.
2007-11-14 Samuel Thibault <samuel.thibault@ens-lyon.org>
* i386/i386at/model_dep.c (machine_relax): New function.
(c_boot_entry): Refuse to boot as dom0.
2007-10-17 Samuel Thibault <samuel.thibault@ens-lyon.org>
* i386/i386/fpu.c [MACH_XEN]: Disable unused fpintr().
2007-08-12 Samuel Thibault <samuel.thibault@ens-lyon.org>
* Makefile.am (clib_routines): Add _START.
* i386/xen/xen_boothdr: Use _START for VIRT_BASE and PADDR_OFFSET. Add
GUEST_VERSION and XEN_ELFNOTE_FEATURES.
2007-06-13 Samuel Thibault <samuel.thibault@ens-lyon.org>
* i386/i386/user_ldt.h (user_ldt) [MACH_XEN]: Add alloc field.
* i386/i386/user_ldt.c (i386_set_ldt) [MACH_XEN]: Round allocation of
LDT to a page, set back LDT pages read/write before freeing them.
(user_ldt_free) [MACH_XEN]: Likewise.
2007-04-18 Samuel Thibault <samuel.thibault@ens-lyon.org>
* device/ds_routines.c [MACH_HYP]: Add hypervisor block and net devices.
2007-02-19 Thomas Schwinge <tschwinge@gnu.org>
* i386/xen/Makefrag.am [PLATFORM_xen] (gnumach_LINKFLAGS): Define.
* Makefrag.am: Include `xen/Makefrag.am'.
* configure.ac: Include `xen/configfrag.ac'.
(--enable-platform): Support the `xen' platform.
* i386/configfrag.ac: Likewise.
* i386/Makefrag.am [PLATFORM_xen]: Include `i386/xen/Makefrag.am'.
2007-02-19 Samuel Thibault <samuel.thibault@ens-lyon.org>
Thomas Schwinge <tschwinge@gnu.org>
* i386/xen/Makefrag.am: New file.
* xen/Makefrag.am: Likewise.
* xen/configfrag.ac: Likewise.
2007-02-11 (and later dates) Samuel Thibault <samuel.thibault@ens-lyon.org>
Xen support
* Makefile.am (clib_routines): Add _start.
* Makefrag.am (include_mach_HEADERS): Add include/mach/xen.h.
* device/cons.c (cnputc): Call hyp_console_write.
* i386/Makefrag.am (libkernel_a_SOURCES): Move non-Xen source to
[PLATFORM_at].
* i386/i386/debug_trace.S: Include <i386/xen.h>
* i386/i386/fpu.c [MACH_HYP] (init_fpu): Call set_ts() and clear_ts(),
do not enable CR0_EM.
* i386/i386/gdt.c: Include <mach/xen.h> and <intel/pmap.h>.
[MACH_XEN]: Make gdt array extern.
[MACH_XEN] (gdt_init): Register gdt with hypervisor. Request 4gb
segments assist. Shift la_shift.
[MACH_PSEUDO_PHYS] (gdt_init): Shift pfn_list.
* i386/i386/gdt.h [MACH_XEN]: Don't define KERNEL_LDT and LINEAR_DS.
* i386/i386/i386asm.sym: Include <i386/xen.h>.
[MACH_XEN]: Remove KERNEL_LDT, Add shared_info's CPU_CLI, CPU_PENDING,
CPU_PENDING_SEL, PENDING, EVTMASK and CR2.
* i386/i386/idt.c [MACH_HYP] (idt_init): Register trap table with
hypervisor.
* i386/i386/idt_inittab.S: Include <i386/i386asm.h>.
[MACH_XEN]: Set IDT_ENTRY() for hypervisor. Set trap table terminator.
* i386/i386/ktss.c [MACH_XEN] (ktss_init): Request exception task switch
from hypervisor.
* i386/i386/ldt.c: Include <mach/xen.h> and <intel/pmap.h>
[MACH_XEN]: Make ldt array extern.
[MACH_XEN] (ldt_init): Set ldt readwrite.
[MACH_HYP] (ldt_init): Register ldt with hypervisor.
* i386/i386/locore.S: Include <i386/xen.h>. Handle KERNEL_RING == 1
case.
[MACH_XEN]: Read hyp_shared_info's CR2 instead of %cr2.
[MACH_PSEUDO_PHYS]: Add mfn_to_pfn computation.
[MACH_HYP]: Drop Cyrix I/O-based detection. Read cr3 instead of %cr3.
Make hypervisor call for pte invalidation.
* i386/i386/mp_desc.c: Include <mach/xen.h>.
[MACH_HYP] (mp_desc_init): Panic.
* i386/i386/pcb.c: Include <mach/xen.h>.
[MACH_XEN] (switch_ktss): Request stack switch from hypervisor.
[MACH_HYP] (switch_ktss): Request ldt and gdt switch from hypervisor.
* i386/i386/phys.c: Include <mach/xen.h>
[MACH_PSEUDO_PHYS] (kvtophys): Do page translation.
* i386/i386/proc_reg.h [MACH_HYP] (cr3): New declaration.
(set_cr3, get_cr3, set_ts, clear_ts): Implement macros.
* i386/i386/seg.h [MACH_HYP]: Define KERNEL_RING macro. Include
<mach/xen.h>
[MACH_XEN] (fill_descriptor): Register descriptor with hypervisor.
* i386/i386/spl.S: Include <i386/xen.h> and <i386/i386/asm.h>
[MACH_XEN] (pic_mask): #define to int_mask.
[MACH_XEN] (SETMASK): Implement.
* i386/i386/vm_param.h [MACH_XEN] (HYP_VIRT_START): New macro.
[MACH_XEN]: Set VM_MAX_KERNEL_ADDRESS to HYP_VIRT_START-
LINEAR_MIN_KERNEL_ADDRESS + VM_MIN_KERNEL_ADDRESS. Increase
KERNEL_STACK_SIZE and INTSTACK_SIZE to 4 pages.
* i386/i386at/conf.c [MACH_HYP]: Remove hardware devices, add hypervisor
console device.
* i386/i386at/cons_conf.c [MACH_HYP]: Add hypervisor console device.
* i386/i386at/model_dep.c: Include <sys/types.h>, <mach/xen.h>.
[MACH_XEN] Include <xen/console.h>, <xen/store.h>, <xen/evt.h>,
<xen/xen.h>.
[MACH_PSEUDO_PHYS]: New boot_info, mfn_list, pfn_list variables.
[MACH_XEN]: New la_shift variable.
[MACH_HYP] (avail_next, mem_size_init): Drop BIOS skipping mechanism.
[MACH_HYP] (machine_init): Call hyp_init(), drop hardware
initialization.
[MACH_HYP] (machine_idle): Call hyp_idle().
[MACH_HYP] (halt_cpu): Call hyp_halt().
[MACH_HYP] (halt_all_cpus): Call hyp_reboot() or hyp_halt().
[MACH_HYP] (i386at_init): Initialize with hypervisor.
[MACH_XEN] (c_boot_entry): Add Xen-specific initialization.
[MACH_HYP] (init_alloc_aligned, pmap_valid_page): Drop zones skipping
mechanism.
* i386/intel/pmap.c: Include <mach/xen.h>.
[MACH_PSEUDO_PHYS] (WRITE_PTE): Do page translation.
[MACH_HYP] (INVALIDATE_TLB): Request invalidation from hypervisor.
[MACH_XEN] (pmap_map_bd, pmap_create, pmap_destroy, pmap_remove_range)
(pmap_page_protect, pmap_protect, pmap_enter, pmap_change_wiring)
(pmap_attribute_clear, pmap_unmap_page_zero, pmap_collect): Request MMU
update from hypervisor.
[MACH_XEN] (pmap_bootstrap): Request pagetable initialization from
hypervisor.
[MACH_XEN] (pmap_set_page_readwrite, pmap_set_page_readonly)
(pmap_set_page_readonly_init, pmap_clear_bootstrap_pagetable)
(pmap_map_mfn): New functions.
* i386/intel/pmap.h [MACH_XEN] (INTEL_PTE_GLOBAL): Disable global page
support.
[MACH_PSEUDO_PHYS] (pte_to_pa): Do page translation.
[MACH_XEN] (pmap_set_page_readwrite, pmap_set_page_readonly)
(pmap_set_page_readonly_init, pmap_clear_bootstrap_pagetable)
(pmap_map_mfn): Declare functions.
* i386/i386/xen.h: New file.
* i386/xen/xen.c: New file.
* i386/xen/xen_boothdr.S: New file.
* i386/xen/xen_locore.S: New file.
* include/mach/xen.h: New file.
* kern/bootstrap.c [MACH_XEN] (boot_info): Declare variable.
[MACH_XEN] (bootstrap_create): Rebase multiboot header.
* kern/debug.c: Include <mach/xen.h>.
[MACH_HYP] (panic): Call hyp_crash() without delay.
* linux/dev/include/asm-i386/segment.h [MACH_HYP] (KERNEL_CS)
(KERNEL_DS): Use ring 1.
* xen/block.c: New file.
* xen/block.h: Likewise.
* xen/console.c: Likewise.
* xen/console.h: Likewise.
* xen/evt.c: Likewise.
* xen/evt.h: Likewise.
* xen/grant.c: Likewise.
* xen/grant.h: Likewise.
* xen/net.c: Likewise.
* xen/net.h: Likewise.
* xen/ring.c: Likewise.
* xen/ring.h: Likewise.
* xen/store.c: Likewise.
* xen/store.h: Likewise.
* xen/time.c: Likewise.
* xen/time.h: Likewise.
* xen/xen.c: Likewise.
* xen/xen.h: Likewise.
* xen/public/COPYING: Import file from Xen.
* xen/public/callback.h: Likewise.
* xen/public/dom0_ops.h: Likewise.
* xen/public/domctl.h: Likewise.
* xen/public/elfnote.h: Likewise.
* xen/public/elfstructs.h: Likewise.
* xen/public/event_channel.h: Likewise.
* xen/public/features.h: Likewise.
* xen/public/grant_table.h: Likewise.
* xen/public/kexec.h: Likewise.
* xen/public/libelf.h: Likewise.
* xen/public/memory.h: Likewise.
* xen/public/nmi.h: Likewise.
* xen/public/physdev.h: Likewise.
* xen/public/platform.h: Likewise.
* xen/public/sched.h: Likewise.
* xen/public/sysctl.h: Likewise.
* xen/public/trace.h: Likewise.
* xen/public/vcpu.h: Likewise.
* xen/public/version.h: Likewise.
* xen/public/xen-compat.h: Likewise.
* xen/public/xen.h: Likewise.
* xen/public/xencomm.h: Likewise.
* xen/public/xenoprof.h: Likewise.
* xen/public/arch-x86/xen-mca.h: Likewise.
* xen/public/arch-x86/xen-x86_32.h: Likewise.
* xen/public/arch-x86/xen-x86_64.h: Likewise.
* xen/public/arch-x86/xen.h: Likewise.
* xen/public/arch-x86_32.h: Likewise.
* xen/public/arch-x86_64.h: Likewise.
* xen/public/io/blkif.h: Likewise.
* xen/public/io/console.h: Likewise.
* xen/public/io/fbif.h: Likewise.
* xen/public/io/fsif.h: Likewise.
* xen/public/io/kbdif.h: Likewise.
* xen/public/io/netif.h: Likewise.
* xen/public/io/pciif.h: Likewise.
* xen/public/io/protocols.h: Likewise.
* xen/public/io/ring.h: Likewise.
* xen/public/io/tpmif.h: Likewise.
* xen/public/io/xenbus.h: Likewise.
* xen/public/io/xs_wire.h: Likewise.
Diffstat (limited to 'xen/net.c')
-rw-r--r-- | xen/net.c | 665 |
1 files changed, 665 insertions, 0 deletions
diff --git a/xen/net.c b/xen/net.c
new file mode 100644
index 00000000..1bb217ba
--- /dev/null
+++ b/xen/net.c
@@ -0,0 +1,665 @@
+/*
+ * Copyright (C) 2006 Samuel Thibault <samuel.thibault@ens-lyon.org>
+ *
+ * This program is free software ; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation ; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY ; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with the program ; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include <sys/types.h>
+#include <mach/mig_errors.h>
+#include <ipc/ipc_port.h>
+#include <ipc/ipc_space.h>
+#include <vm/vm_kern.h>
+#include <device/device_types.h>
+#include <device/device_port.h>
+#include <device/if_hdr.h>
+#include <device/if_ether.h>
+#include <device/net_io.h>
+#include <device/device_reply.user.h>
+#include <device/device_emul.h>
+#include <intel/pmap.h>
+#include <xen/public/io/netif.h>
+#include <xen/public/memory.h>
+#include <string.h>
+#include <util/atoi.h>
+#include "evt.h"
+#include "store.h"
+#include "net.h"
+#include "grant.h"
+#include "ring.h"
+#include "time.h"
+#include "xen.h"
+
+/* Hypervisor part */
+
+#define ADDRESS_SIZE 6
+#define WINDOW __RING_SIZE((netif_rx_sring_t*)0, PAGE_SIZE)
+
+/* TODO: use rx-copy instead, since we're memcpying anyway */
+
+/* Are we paranoid enough to not leak anything to backend? */
+static const int paranoia = 0;
+
+struct net_data {
+	struct device device;
+	struct ifnet ifnet;
+	int open_count;
+	char *backend;
+	domid_t domid;
+	char *vif;
+	u_char address[ADDRESS_SIZE];
+	int handle;
+	ipc_port_t port;
+	netif_tx_front_ring_t tx;
+	netif_rx_front_ring_t rx;
+	void *rx_buf[WINDOW];
+	grant_ref_t rx_buf_gnt[WINDOW];
+	unsigned long rx_buf_pfn[WINDOW];
+	evtchn_port_t evt;
+	simple_lock_data_t lock;
+	simple_lock_data_t pushlock;
+};
+
+static int n_vifs;
+static struct net_data *vif_data;
+
+struct device_emulation_ops hyp_net_emulation_ops;
+
+int hextoi(char *cp, int *nump)
+{
+	int number;
+	char *original;
+	char c;
+
+	original = cp;
+	for (number = 0, c = *cp | 0x20; (('0' <= c) && (c <= '9')) || (('a' <= c) && (c <= 'f')); c = *(++cp)) {
+		number *= 16;
+		if (c <= '9')
+			number += c - '0';
+		else
+			number += c - 'a' + 10;
+	}
+	if (original == cp)
+		*nump = -1;
+	else
+		*nump = number;
+	return(cp - original);
+}
+
+static void enqueue_rx_buf(struct net_data *nd, int number) {
+	unsigned reqn = nd->rx.req_prod_pvt++;
+	netif_rx_request_t *req = RING_GET_REQUEST(&nd->rx, reqn);
+
+	assert(number < WINDOW);
+
+	req->id = number;
+	req->gref = nd->rx_buf_gnt[number] = hyp_grant_accept_transfer(nd->domid, nd->rx_buf_pfn[number]);
+
+	/* give back page */
+	hyp_free_page(nd->rx_buf_pfn[number], nd->rx_buf[number]);
+}
+
+static void hyp_net_intr(int unit) {
+	ipc_kmsg_t kmsg;
+	struct ether_header *eh;
+	struct packet_header *ph;
+	netif_rx_response_t *rx_rsp;
+	netif_tx_response_t *tx_rsp;
+	void *data;
+	int len, more;
+	struct net_data *nd = &vif_data[unit];
+
+	simple_lock(&nd->lock);
+	if ((nd->rx.sring->rsp_prod - nd->rx.rsp_cons) >= (WINDOW*3)/4)
+		printf("window %ld a bit small!\n", WINDOW);
+
+	more = RING_HAS_UNCONSUMED_RESPONSES(&nd->rx);
+	while (more) {
+		rmb(); /* make sure we see responses */
+		rx_rsp = RING_GET_RESPONSE(&nd->rx, nd->rx.rsp_cons++);
+
+		unsigned number = rx_rsp->id;
+		assert(number < WINDOW);
+		unsigned long mfn = hyp_grant_finish_transfer(nd->rx_buf_gnt[number]);
+
+#ifdef MACH_PSEUDO_PHYS
+		mfn_list[nd->rx_buf_pfn[number]] = mfn;
+#endif /* MACH_PSEUDO_PHYS */
+		pmap_map_mfn(nd->rx_buf[number], mfn);
+
+		kmsg = net_kmsg_get();
+		if (!kmsg)
+			/* gasp! Drop */
+			goto drop;
+
+		if (rx_rsp->status <= 0)
+			switch (rx_rsp->status) {
+			case NETIF_RSP_DROPPED:
+				printf("Packet dropped\n");
+				goto drop;
+			case NETIF_RSP_ERROR:
+				panic("Packet error");
+			case 0:
+				printf("nul packet\n");
+				goto drop;
+			default:
+				printf("Unknown error %d\n", rx_rsp->status);
+				goto drop;
+			}
+
+		data = nd->rx_buf[number] + rx_rsp->offset;
+		len = rx_rsp->status;
+
+		eh = (void*) (net_kmsg(kmsg)->header);
+		ph = (void*) (net_kmsg(kmsg)->packet);
+		memcpy(eh, data, sizeof (struct ether_header));
+		memcpy(ph + 1, data + sizeof (struct ether_header), len - sizeof(struct ether_header));
+		RING_FINAL_CHECK_FOR_RESPONSES(&nd->rx, more);
+		enqueue_rx_buf(nd, number);
+		ph->type = eh->ether_type;
+		ph->length = len - sizeof(struct ether_header) + sizeof (struct packet_header);
+
+		net_kmsg(kmsg)->sent = FALSE; /* Mark packet as received. */
+
+		net_packet(&nd->ifnet, kmsg, ph->length, ethernet_priority(kmsg));
+		continue;
+
+drop:
+		RING_FINAL_CHECK_FOR_RESPONSES(&nd->rx, more);
+		enqueue_rx_buf(nd, number);
+	}
+
+	/* commit new requests */
+	int notify;
+	wmb(); /* make sure it sees requests */
+	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&nd->rx, notify);
+	if (notify)
+		hyp_event_channel_send(nd->evt);
+
+	/* Now the tx side */
+	more = RING_HAS_UNCONSUMED_RESPONSES(&nd->tx);
+	spl_t s = splsched ();
+	while (more) {
+		rmb(); /* make sure we see responses */
+		tx_rsp = RING_GET_RESPONSE(&nd->tx, nd->tx.rsp_cons++);
+		switch (tx_rsp->status) {
+		case NETIF_RSP_DROPPED:
+			printf("Packet dropped\n");
+			break;
+		case NETIF_RSP_ERROR:
+			panic("Packet error");
+		case NETIF_RSP_OKAY:
+			break;
+		default:
+			printf("Unknown error %d\n", tx_rsp->status);
+			goto drop_tx;
+		}
+		thread_wakeup((event_t) hyp_grant_address(tx_rsp->id));
+drop_tx:
+		thread_wakeup_one(nd);
+		RING_FINAL_CHECK_FOR_RESPONSES(&nd->tx, more);
+	}
+	splx(s);
+
+	simple_unlock(&nd->lock);
+}
+
+#define VIF_PATH "device/vif"
+void hyp_net_init(void) {
+	char **vifs, **vif;
+	char *c;
+	int i;
+	int n;
+	int grant;
+	char port_name[10];
+	domid_t domid;
+	evtchn_port_t evt;
+	hyp_store_transaction_t t;
+	vm_offset_t addr;
+	struct net_data *nd;
+	struct ifnet *ifp;
+	netif_tx_sring_t *tx_ring;
+	netif_rx_sring_t *rx_ring;
+
+	vifs = hyp_store_ls(0, 1, VIF_PATH);
+	if (!vifs) {
+		printf("eth: No net device (%s). Hoping you don't need any\n", hyp_store_error);
+		n_vifs = 0;
+		return;
+	}
+
+	n = 0;
+	for (vif = vifs; *vif; vif++)
+		n++;
+
+	vif_data = (void*) kalloc(n * sizeof(*vif_data));
+	if (!vif_data) {
+		printf("eth: No memory room for VIF\n");
+		n_vifs = 0;
+		return;
+	}
+	n_vifs = n;
+
+	for (n = 0; n < n_vifs; n++) {
+		nd = &vif_data[n];
+		mach_atoi((u_char *) vifs[n], &nd->handle);
+		if (nd->handle == MACH_ATOI_DEFAULT)
+			continue;
+
+		nd->open_count = -2;
+		nd->vif = vifs[n];
+
+		/* Get domain id of frontend driver. */
+		i = hyp_store_read_int(0, 5, VIF_PATH, "/", vifs[n], "/", "backend-id");
+		if (i == -1)
+			panic("eth: couldn't read frontend domid of VIF %s (%s)",vifs[n], hyp_store_error);
+		nd->domid = domid = i;
+
+		do {
+			t = hyp_store_transaction_start();
+
+			/* Get a page for tx_ring */
+			if (kmem_alloc_wired(kernel_map, &addr, PAGE_SIZE) != KERN_SUCCESS)
+				panic("eth: couldn't allocate space for store tx_ring");
+			tx_ring = (void*) addr;
+			SHARED_RING_INIT(tx_ring);
+			FRONT_RING_INIT(&nd->tx, tx_ring, PAGE_SIZE);
+			grant = hyp_grant_give(domid, atop(kvtophys(addr)), 0);
+
+			/* and give it to backend. */
+			i = sprintf(port_name, "%u", grant);
+			c = hyp_store_write(t, port_name, 5, VIF_PATH, "/", vifs[n], "/", "tx-ring-ref");
+			if (!c)
+				panic("eth: couldn't store tx_ring reference for VIF %s (%s)", vifs[n], hyp_store_error);
+			kfree((vm_offset_t) c, strlen(c)+1);
+
+			/* Get a page for rx_ring */
+			if (kmem_alloc_wired(kernel_map, &addr, PAGE_SIZE) != KERN_SUCCESS)
+				panic("eth: couldn't allocate space for store tx_ring");
+			rx_ring = (void*) addr;
+			SHARED_RING_INIT(rx_ring);
+			FRONT_RING_INIT(&nd->rx, rx_ring, PAGE_SIZE);
+			grant = hyp_grant_give(domid, atop(kvtophys(addr)), 0);
+
+			/* and give it to backend. */
+			i = sprintf(port_name, "%u", grant);
+			c = hyp_store_write(t, port_name, 5, VIF_PATH, "/", vifs[n], "/", "rx-ring-ref");
+			if (!c)
+				panic("eth: couldn't store rx_ring reference for VIF %s (%s)", vifs[n], hyp_store_error);
+			kfree((vm_offset_t) c, strlen(c)+1);
+
+			/* tell we need csums. */
+			c = hyp_store_write(t, "1", 5, VIF_PATH, "/", vifs[n], "/", "feature-no-csum-offload");
+			if (!c)
+				panic("eth: couldn't store feature-no-csum-offload reference for VIF %s (%s)", vifs[n], hyp_store_error);
+			kfree((vm_offset_t) c, strlen(c)+1);
+
+			/* Allocate an event channel and give it to backend. */
+			nd->evt = evt = hyp_event_channel_alloc(domid);
+			i = sprintf(port_name, "%lu", evt);
+			c = hyp_store_write(t, port_name, 5, VIF_PATH, "/", vifs[n], "/", "event-channel");
+			if (!c)
+				panic("eth: couldn't store event channel for VIF %s (%s)", vifs[n], hyp_store_error);
+			kfree((vm_offset_t) c, strlen(c)+1);
+			c = hyp_store_write(t, hyp_store_state_initialized, 5, VIF_PATH, "/", vifs[n], "/", "state");
+			if (!c)
+				panic("eth: couldn't store state for VIF %s (%s)", vifs[n], hyp_store_error);
+			kfree((vm_offset_t) c, strlen(c)+1);
+		} while ((!hyp_store_transaction_stop(t)));
+		/* TODO randomly wait? */
+
+		c = hyp_store_read(0, 5, VIF_PATH, "/", vifs[n], "/", "backend");
+		if (!c)
+			panic("eth: couldn't get path to VIF %s backend (%s)", vifs[n], hyp_store_error);
+		nd->backend = c;
+
+		while(1) {
+			i = hyp_store_read_int(0, 3, nd->backend, "/", "state");
+			if (i == MACH_ATOI_DEFAULT)
+				panic("can't read state from %s", nd->backend);
+			if (i == XenbusStateInitWait)
+				break;
+			hyp_yield();
+		}
+
+		c = hyp_store_read(0, 3, nd->backend, "/", "mac");
+		if (!c)
+			panic("eth: couldn't get VIF %s's mac (%s)", vifs[n], hyp_store_error);
+
+		for (i=0; ; i++) {
+			int val;
+			hextoi(&c[3*i], &val);
+			if (val == -1)
+				panic("eth: couldn't understand %dth number of VIF %s's mac %s", i, vifs[n], c);
+			nd->address[i] = val;
+			if (i==ADDRESS_SIZE-1)
+				break;
+			if (c[3*i+2] != ':')
+				panic("eth: couldn't understand %dth separator of VIF %s's mac %s", i, vifs[n], c);
+		}
+		kfree((vm_offset_t) c, strlen(c)+1);
+
+		printf("eth%d: dom%d's VIF %s ", n, domid, vifs[n]);
+		for (i=0; ; i++) {
+			printf("%02x", nd->address[i]);
+			if (i==ADDRESS_SIZE-1)
+				break;
+			printf(":");
+		}
+		printf("\n");
+
+		c = hyp_store_write(0, hyp_store_state_connected, 5, VIF_PATH, "/", nd->vif, "/", "state");
+		if (!c)
+			panic("couldn't store state for eth%d (%s)", nd - vif_data, hyp_store_error);
+		kfree((vm_offset_t) c, strlen(c)+1);
+
+		/* Get a page for packet reception */
+		for (i= 0; i<WINDOW; i++) {
+			if (kmem_alloc_wired(kernel_map, &addr, PAGE_SIZE) != KERN_SUCCESS)
+				panic("eth: couldn't allocate space for store tx_ring");
+			nd->rx_buf[i] = (void*)phystokv(kvtophys(addr));
+			nd->rx_buf_pfn[i] = atop(kvtophys((vm_offset_t)nd->rx_buf[i]));
+			if (hyp_do_update_va_mapping(kvtolin(addr), 0, UVMF_INVLPG|UVMF_ALL))
+				panic("eth: couldn't clear rx kv buf %d at %p", i, addr);
+			/* and enqueue it to backend. */
+			enqueue_rx_buf(nd, i);
+		}
+		int notify;
+		wmb(); /* make sure it sees requests */
+		RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&nd->rx, notify);
+		if (notify)
+			hyp_event_channel_send(nd->evt);
+
+
+		nd->open_count = -1;
+		nd->device.emul_ops = &hyp_net_emulation_ops;
+		nd->device.emul_data = nd;
+		simple_lock_init(&nd->lock);
+		simple_lock_init(&nd->pushlock);
+
+		ifp = &nd->ifnet;
+		ifp->if_unit = n;
+		ifp->if_flags = IFF_UP | IFF_RUNNING;
+		ifp->if_header_size = 14;
+		ifp->if_header_format = HDR_ETHERNET;
+		/* Set to the maximum that we can handle in device_write. */
+		ifp->if_mtu = PAGE_SIZE - ifp->if_header_size;
+		ifp->if_address_size = ADDRESS_SIZE;
+		ifp->if_address = (void*) nd->address;
+		if_init_queues (ifp);
+
+		/* Now we can start receiving */
+		hyp_evt_handler(evt, hyp_net_intr, n, SPL6);
+	}
+}
+
+static ipc_port_t
+dev_to_port(void *d)
+{
+	struct net_data *b = d;
+	if (!d)
+		return IP_NULL;
+	return ipc_port_make_send(b->port);
+}
+
+static int
+device_close(void *devp)
+{
+	struct net_data *nd = devp;
+	if (--nd->open_count < 0)
+		panic("too many closes on eth%d", nd - vif_data);
+	printf("close, eth%d count %d\n",nd-vif_data,nd->open_count);
+	if (nd->open_count)
+		return 0;
+	ipc_kobject_set(nd->port, IKO_NULL, IKOT_NONE);
+	ipc_port_dealloc_kernel(nd->port);
+	return 0;
+}
+
+static io_return_t
+device_open (ipc_port_t reply_port, mach_msg_type_name_t reply_port_type,
+	dev_mode_t mode, char *name, device_t *devp /* out */)
+{
+	int i, n, err = 0;
+	ipc_port_t port, notify;
+	struct net_data *nd;
+
+	if (name[0] != 'e' || name[1] != 't' || name[2] != 'h' || name[3] < '0' || name[3] > '9')
+		return D_NO_SUCH_DEVICE;
+	i = mach_atoi((u_char *) &name[3], &n);
+	if (n == MACH_ATOI_DEFAULT)
+		return D_NO_SUCH_DEVICE;
+	if (name[3 + i])
+		return D_NO_SUCH_DEVICE;
+	if (n >= n_vifs)
+		return D_NO_SUCH_DEVICE;
+	nd = &vif_data[n];
+	if (nd->open_count == -2)
+		/* couldn't be initialized */
+		return D_NO_SUCH_DEVICE;
+
+	if (nd->open_count >= 0) {
+		*devp = &nd->device ;
+		nd->open_count++ ;
+		printf("re-open, eth%d count %d\n",nd-vif_data,nd->open_count);
+		return D_SUCCESS;
+	}
+
+	nd->open_count = 1;
+	printf("eth%d count %d\n",nd-vif_data,nd->open_count);
+
+	port = ipc_port_alloc_kernel();
+	if (port == IP_NULL) {
+		err = KERN_RESOURCE_SHORTAGE;
+		goto out;
+	}
+	nd->port = port;
+
+	*devp = &nd->device;
+
+	ipc_kobject_set (port, (ipc_kobject_t) &nd->device, IKOT_DEVICE);
+
+	notify = ipc_port_make_sonce (nd->port);
+	ip_lock (nd->port);
+	ipc_port_nsrequest (nd->port, 1, notify, &notify);
+	assert (notify == IP_NULL);
+
+out:
+	if (IP_VALID (reply_port))
+		ds_device_open_reply (reply_port, reply_port_type, D_SUCCESS, dev_to_port(nd));
+	else
+		device_close(nd);
+	return MIG_NO_REPLY;
+}
+
+static io_return_t
+device_write(void *d, ipc_port_t reply_port,
+	mach_msg_type_name_t reply_port_type, dev_mode_t mode,
+	recnum_t bn, io_buf_ptr_t data, unsigned int count,
+	int *bytes_written)
+{
+	vm_map_copy_t copy = (vm_map_copy_t) data;
+	grant_ref_t gref;
+	struct net_data *nd = d;
+	struct ifnet *ifp = &nd->ifnet;
+	netif_tx_request_t *req;
+	unsigned reqn;
+	vm_offset_t offset;
+	vm_page_t m;
+	vm_size_t size;
+
+	/* The maximum that we can handle. */
+	assert(ifp->if_header_size + ifp->if_mtu <= PAGE_SIZE);
+
+	if (count < ifp->if_header_size ||
+	    count > ifp->if_header_size + ifp->if_mtu)
+		return D_INVALID_SIZE;
+
+	assert(copy->type == VM_MAP_COPY_PAGE_LIST);
+
+	assert(copy->cpy_npages <= 2);
+	assert(copy->cpy_npages >= 1);
+
+	offset = copy->offset & PAGE_MASK;
+	if (paranoia || copy->cpy_npages == 2) {
+		/* have to copy :/ */
+		while ((m = vm_page_grab(FALSE)) == 0)
+			VM_PAGE_WAIT (0);
+		assert (! m->active && ! m->inactive);
+		m->busy = TRUE;
+
+		if (copy->cpy_npages == 1)
+			size = count;
+		else
+			size = PAGE_SIZE - offset;
+
+		memcpy((void*)phystokv(m->phys_addr), (void*)phystokv(copy->cpy_page_list[0]->phys_addr + offset), size);
+		if (copy->cpy_npages == 2)
+			memcpy((void*)phystokv(m->phys_addr + size), (void*)phystokv(copy->cpy_page_list[1]->phys_addr), count - size);
+
+		offset = 0;
+	} else
+		m = copy->cpy_page_list[0];
+
+	/* allocate a request */
+	spl_t spl = splimp();
+	while (1) {
+		simple_lock(&nd->lock);
+		if (!RING_FULL(&nd->tx))
+			break;
+		thread_sleep(nd, &nd->lock, FALSE);
+	}
+	mb();
+	reqn = nd->tx.req_prod_pvt++;;
+	simple_lock(&nd->pushlock);
+	simple_unlock(&nd->lock);
+	(void) splx(spl);
+
+	req = RING_GET_REQUEST(&nd->tx, reqn);
+	req->gref = gref = hyp_grant_give(nd->domid, atop(m->phys_addr), 1);
+	req->offset = offset;
+	req->flags = 0;
+	req->id = gref;
+	req->size = count;
+
+	assert_wait(hyp_grant_address(gref), FALSE);
+
+	int notify;
+	wmb(); /* make sure it sees requests */
+	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&nd->tx, notify);
+	if (notify)
+		hyp_event_channel_send(nd->evt);
+	simple_unlock(&nd->pushlock);
+
+	thread_block(NULL);
+
+	hyp_grant_takeback(gref);
+
+	/* Send packet to filters. */
+	{
+	  struct packet_header *packet;
+	  struct ether_header *header;
+	  ipc_kmsg_t kmsg;
+
+	  kmsg = net_kmsg_get ();
+
+	  if (kmsg != IKM_NULL)
+	    {
+	      /* Suitable for Ethernet only. */
+	      header = (struct ether_header *) (net_kmsg (kmsg)->header);
+	      packet = (struct packet_header *) (net_kmsg (kmsg)->packet);
+	      memcpy (header, (void*)phystokv(m->phys_addr + offset), sizeof (struct ether_header));
+
+	      /* packet is prefixed with a struct packet_header,
+	         see include/device/net_status.h. */
+	      memcpy (packet + 1, (void*)phystokv(m->phys_addr + offset + sizeof (struct ether_header)),
+	              count - sizeof (struct ether_header));
+	      packet->length = count - sizeof (struct ether_header)
+	                       + sizeof (struct packet_header);
+	      packet->type = header->ether_type;
+	      net_kmsg (kmsg)->sent = TRUE; /* Mark packet as sent. */
+	      spl_t s = splimp ();
+	      net_packet (&nd->ifnet, kmsg, packet->length,
+	                  ethernet_priority (kmsg));
+	      splx (s);
+	    }
+	}
+
+	if (paranoia || copy->cpy_npages == 2)
+		VM_PAGE_FREE(m);
+
+	vm_map_copy_discard (copy);
+
+	*bytes_written = count;
+
+	if (IP_VALID(reply_port))
+		ds_device_write_reply (reply_port, reply_port_type, 0, count);
+
+	return MIG_NO_REPLY;
+}
+
+static io_return_t
+device_get_status(void *d, dev_flavor_t flavor, dev_status_t status,
+	mach_msg_type_number_t *status_count)
+{
+	struct net_data *nd = d;
+
+	return net_getstat (&nd->ifnet, flavor, status, status_count);
+}
+
+static io_return_t
+device_set_status(void *d, dev_flavor_t flavor, dev_status_t status,
+	mach_msg_type_number_t count)
+{
+	struct net_data *nd = d;
+
+	switch (flavor)
+	{
+	default:
+		printf("TODO: net_%s(%p, 0x%x)\n", __func__, nd, flavor);
+		return D_INVALID_OPERATION;
+	}
+	return D_SUCCESS;
+}
+
+static io_return_t
+device_set_filter(void *d, ipc_port_t port, int priority,
+	filter_t * filter, unsigned filter_count)
+{
+	struct net_data *nd = d;
+
+	if (!nd)
+		return D_NO_SUCH_DEVICE;
+
+	return net_set_filter (&nd->ifnet, port, priority, filter, filter_count);
+}
+
+struct device_emulation_ops hyp_net_emulation_ops = {
+	NULL, /* dereference */
+	NULL, /* deallocate */
+	dev_to_port,
+	device_open,
+	device_close,
+	device_write,
+	NULL, /* write_inband */
+	NULL,
+	NULL, /* read_inband */
+	device_set_status, /* set_status */
+	device_get_status,
+	device_set_filter, /* set_filter */
+	NULL, /* map */
+	NULL, /* no_senders */
+	NULL, /* write_trap */
+	NULL, /* writev_trap */
+};
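
Note on the MAC parsing in hyp_net_init(): the loop reads the backend's "mac" string three characters at a time, two hex digits followed by a ':' separator, and stores one octet per step into nd->address. The standalone sketch below illustrates that stride-of-three walk outside the kernel. It is not the code from this commit: hex_value() and parse_mac() are hypothetical names, and the sketch deliberately assumes exactly two hex digits per octet, accepts upper-case digits in either position, and rejects trailing characters.

```c
#include <assert.h>
#include <stddef.h>

#define ADDRESS_SIZE 6	/* same octet count as the patch */

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical helper: parse "xx:xx:xx:xx:xx:xx" into address[].
 * Returns 0 on success, -1 on any malformed input.
 * Octet i starts at offset 3*i; its separator sits at offset 3*i+2.
 */
static int parse_mac(const char *s, unsigned char address[ADDRESS_SIZE])
{
	int i;

	assert(s != NULL);

	for (i = 0; i < ADDRESS_SIZE; i++) {
		int hi = hex_value(s[3*i]);
		/* Only look at the second digit if the first was valid,
		   so we never read past the terminating NUL. */
		int lo = (hi < 0) ? -1 : hex_value(s[3*i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		address[i] = (unsigned char)(hi * 16 + lo);

		/* Every octet except the last must be followed by ':'. */
		if (i < ADDRESS_SIZE - 1 && s[3*i + 2] != ':')
			return -1;
	}
	/* Assumption of this sketch: nothing may follow the last octet. */
	return s[3*ADDRESS_SIZE - 1] == '\0' ? 0 : -1;
}
```

With these assumptions, parse_mac("00:16:3e:1A:2b:ff", mac) returns 0 and fills mac with 00 16 3e 1a 2b ff, while a truncated string or one using '-' separators returns -1 instead of panicking as the kernel code does.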